The goal of this paper is to make a strong point for the usage of dynamical
models when using reinforcement learning (RL) for feedback control of dynamical
systems governed by partial differential equations (PDEs). To bridge the gap
between the immense promise we see in RL and its applicability to complex
engineering systems, the main challenges are the massive requirements in terms
of training data, as well as the lack of performance guarantees. We present
a solution for the first issue using a data-driven surrogate model in the form
of a convolutional LSTM with actuation. We demonstrate that learning an
actuated model in parallel to training the RL agent significantly reduces the
total amount of required data sampled from the real system. Furthermore, we
show that iteratively updating the model is of major importance to avoid biases
in the RL training. Detailed ablation studies reveal the most important
ingredients of the modeling process. We use the chaotic Kuramoto-Sivashinsky
equation to demonstrate our findings.
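The learn-a-surrogate-while-training loop can be caricatured in a few lines. This is a hypothetical stand-in, with a noisy linear system in place of the PDE and a least-squares fit in place of the convolutional LSTM; the iterative refit mirrors the paper's model-update step, which keeps the surrogate from going stale and biasing the RL training.

```python
import random

random.seed(0)

def real_step(x, u):
    # hypothetical stand-in for the expensive "real" system: unknown linear
    # dynamics plus noise (the paper uses the Kuramoto-Sivashinsky PDE)
    return 0.9 * x + 0.1 * u + random.gauss(0.0, 0.01)

def fit_surrogate(data):
    # least-squares fit of x_next ~ a * x + b * u (stand-in for the ConvLSTM)
    sxx = sxu = suu = sxy = suy = 0.0
    for x, u, y in data:
        sxx += x * x; sxu += x * u; suu += u * u
        sxy += x * y; suy += u * y
    det = sxx * suu - sxu * sxu
    a = (sxy * suu - suy * sxu) / det
    b = (suy * sxx - sxy * sxu) / det
    return a, b

data = []
for _ in range(3):  # model updates interleaved with (stylized) RL training
    x = random.uniform(-1.0, 1.0)
    for _ in range(50):
        u = random.uniform(-1.0, 1.0)  # the agent's actuation
        y = real_step(x, u)
        data.append((x, u, y))
        x = y
    a, b = fit_surrogate(data)  # refit so the surrogate tracks the real system
    # the agent would now train cheaply on the surrogate (x, u) -> a*x + b*u
```

With only 150 real-system samples, the fitted coefficients land close to the true (0.9, 0.1), after which rollouts can be generated from the surrogate for free.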
( 2
min )
Due to the precautionary measures during the COVID-19 pandemic many
universities offered unproctored take-home exams. We propose methods to detect
potential collusion between students and apply our approach on event log data
from take-home exams during the pandemic. We find groups of students with
suspiciously similar exams. In addition, we compare our findings to a proctored
control group. In doing so, we establish a rule of thumb for evaluating which
cases are "outstandingly similar", i.e., suspicious cases.
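The rule-of-thumb idea can be sketched as: compute pairwise similarity over the take-home logs and flag any pair more similar than anything observed in the proctored control group. The students, event logs, and Jaccard similarity below are invented for illustration; the paper's actual features and similarity measure may differ.

```python
from itertools import combinations

def jaccard(a, b):
    # similarity of two event logs as overlap of their (question, answer) sets
    a, b = set(a), set(b)
    return len(a & b) / len(a | b)

# invented (question, answer) event logs per student
takehome = {
    "s1": [("q1", "A"), ("q2", "C"), ("q3", "B")],
    "s2": [("q1", "A"), ("q2", "C"), ("q3", "B")],  # identical to s1
    "s3": [("q1", "B"), ("q2", "D"), ("q3", "A")],
}
control = {  # proctored control group
    "c1": [("q1", "A"), ("q2", "D"), ("q3", "B")],
    "c2": [("q1", "B"), ("q2", "C"), ("q3", "A")],
}

# rule of thumb: "outstandingly similar" = more similar than any pair
# observed under proctoring
threshold = max(jaccard(a, b) for a, b in combinations(control.values(), 2))
flagged = [(u, v) for u, v in combinations(takehome, 2)
           if jaccard(takehome[u], takehome[v]) > threshold]
```

Here only the identical pair ("s1", "s2") exceeds the control-group threshold.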
( 2
min )
Recent applications of pattern recognition techniques on brain connectome
classification using functional connectivity (FC) neglect the non-Euclidean
topology and causal dynamics of brain connectivity across time. In this paper,
a deep probabilistic spatiotemporal framework developed based on variational
Bayes (DSVB) is proposed to learn time-varying topological structures in
dynamic brain FC networks for autism spectrum disorder (ASD) identification.
The proposed framework incorporates a spatial-aware recurrent neural network to
capture rich spatiotemporal patterns across dynamic FC networks, followed by a
fully-connected neural network to exploit these learned patterns for
subject-level classification. To overcome model overfitting on limited training
datasets, an adversarial training strategy is introduced to learn graph
embedding models that generalize well to unseen brain networks. Evaluation on
the ABIDE resting-state functional magnetic resonance imaging dataset shows
that our proposed framework significantly outperforms state-of-the-art methods
in identifying ASD. Dynamic FC analyses with DSVB-learned embeddings reveal
apparent group differences between ASD and healthy controls in network profiles
and switching dynamics of brain states.
( 2
min )
Most works on the fairness of machine learning systems focus on the blind
optimization of common fairness metrics, such as Demographic Parity and
Equalized Odds. In this paper, we conduct a comparative study of several bias
mitigation approaches to investigate their behaviors at a fine grain, the
prediction level. Our objective is to characterize the differences between fair
models obtained with different approaches. With comparable performances in
fairness and accuracy, are the different bias mitigation approaches impacting a
similar number of individuals? Do they mitigate bias in a similar way? Do they
affect the same individuals when debiasing a model? Our findings show that bias
mitigation approaches differ a lot in their strategies, both in the number of
impacted individuals and the populations targeted. More surprisingly, we show
these results even apply for several runs of the same mitigation approach.
These findings raise questions about the limitations of the current group
fairness metrics, as well as the arbitrariness, hence unfairness, of the whole
debiasing process.
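The prediction-level phenomenon is easy to reproduce in miniature: two classifiers can have identical Demographic Parity gaps while disagreeing on most individuals. The toy predictions and groups below are fabricated for the sketch.

```python
def demographic_parity_gap(preds, groups):
    # |P(yhat = 1 | g = 0) - P(yhat = 1 | g = 1)|
    rate = lambda g: (sum(p for p, gr in zip(preds, groups) if gr == g)
                      / groups.count(g))
    return abs(rate(0) - rate(1))

groups  = [0, 0, 0, 1, 1, 1]
model_a = [1, 0, 1, 1, 0, 1]  # fabricated outputs of two "debiased" models
model_b = [0, 1, 1, 1, 1, 0]

gap_a = demographic_parity_gap(model_a, groups)
gap_b = demographic_parity_gap(model_b, groups)
flips = sum(a != b for a, b in zip(model_a, model_b))
# identical group fairness (both gaps are 0), yet 4 of the 6 individuals
# receive different decisions from the two models
```

This is exactly the arbitrariness the abstract points at: the group metric cannot distinguish the two models, but individuals can.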
( 2
min )
The existence of external (``side'') semantic knowledge has been shown to
result in more expressive computational event models. To enable the use of side
information that may be noisy or missing, we propose a semi-supervised
information bottleneck-based discrete latent variable model. We reparameterize
the model's discrete variables with auxiliary continuous latent variables and a
light-weight hierarchical structure. Our model is learned to minimize the
mutual information between the observed data and optional side knowledge that
is not already captured by the new, auxiliary variables. We theoretically show
that our approach generalizes past approaches, and perform an empirical case
study of our approach on event modeling. We corroborate our theoretical results
with strong empirical experiments, showing that the proposed method outperforms
previous proposed approaches on multiple datasets.
( 2
min )
This paper examines the separation of wireless communication and radar
signals, thereby guaranteeing cohabitation and addressing the spectrum sensing
problem. First, assuming that the channel impulse response is known to the
receivers (communication and radar), we show that optimizing the beamforming
weights mitigates the interference caused by the signals and improves the
physical layer security (PLS) of the system. Furthermore, when the channel
responses are unknown, we design an interference filter as a low-complexity
noise and interference cancellation autoencoder. By mitigating the
interference on the legitimate users, the PLS is guaranteed. Results show that
even at a low signal-to-noise ratio, the autoencoder produces low
root-mean-square error (RMSE) values.
( 2
min )
Intrigued by the claims of emergent reasoning capabilities in LLMs trained on
general web corpora, in this paper, we set out to investigate their planning
capabilities. We aim to evaluate (1) how good LLMs are by themselves in
generating and validating simple plans in commonsense planning tasks (of the
type that humans are generally quite good at) and (2) how good LLMs are in
being a source of heuristic guidance for other agents--either AI planners or
human planners--in their planning tasks. To investigate these questions in a
systematic rather than anecdotal manner, we start by developing a benchmark
suite based on the kinds of domains employed in the International Planning
Competition. On this benchmark, we evaluate LLMs in three modes: autonomous,
heuristic, and human-in-the-loop. Our results show that LLMs' ability to
autonomously generate executable plans is quite meager, averaging only about a
3% success rate. The heuristic and human-in-the-loop modes show slightly more
promise. In addition to these results, we also make our benchmark and
evaluation tools available to support investigations by the research community.
( 2
min )
Artificial neural networks are being proposed as models of parts of the
brain. The networks are compared to recordings of biological neurons, and good
performance in reproducing neural responses is considered to support the
model's validity. A key question is how much this system identification
approach tells us about brain computation. Does it validate one model
architecture over another? We evaluate the ability of the most commonly used
comparison techniques, such as linear encoding models and centered kernel
alignment, to correctly identify a model, by replacing brain recordings with
the outputs of known ground-truth models. System identification performance is quite variable; it also
depends significantly on factors independent of the ground truth architecture,
such as stimuli images. In addition, we show the limitations of using
functional similarity scores in identifying higher-level architectural motifs.
( 2
min )
Bilevel Optimization has witnessed notable progress recently with new
emerging efficient algorithms, yet it is underexplored in the Federated
Learning setting. It is unclear how the challenges of Federated Learning affect
the convergence of bilevel algorithms. In this work, we study Federated Bilevel
Optimization problems. We first propose the FedBiO algorithm that solves the
hyper-gradient estimation problem efficiently, then we propose FedBiOAcc to
accelerate FedBiO. FedBiO has communication complexity $O(\epsilon^{-1.5})$
with linear speed up, while FedBiOAcc achieves communication complexity
$O(\epsilon^{-1})$, sample complexity $O(\epsilon^{-1.5})$ and also the linear
speed up. We also study Federated Bilevel Optimization problems with local
lower-level problems, and prove that FedBiO and FedBiOAcc converge at the same
rates with minor modifications.
( 2
min )
Sequential monitoring of high-dimensional nonlinear time series is studied
for a projection of the second-moment matrix, a problem interesting in its own
right and specifically arising in finance and deep learning. Open-end as well
as closed-end monitoring is studied under mild assumptions on the training
sample and the observations of the monitoring period. The asymptotic theory is
based on Gaussian approximations of projected partial sums allowing for an
estimated projection vector. Estimation is studied both in the classical
non-$\ell_0$-sparse setting and under sparsity. For the case that the optimal
projection depends on the unknown covariance matrix, hard- and soft-thresholded
estimators are studied. Applications in finance and training of deep neural
networks are discussed. The proposed detectors typically allow a dramatic
reduction in the required computational costs, as illustrated by monitoring
synthetic data.
( 2
min )
We introduce a boosting algorithm to pre-process data for fairness. Starting
from an initial fair but inaccurate distribution, our approach shifts towards
better data fitting while still ensuring a minimal fairness guarantee. To do
so, it learns the sufficient statistics of an exponential family with
boosting-compliant convergence. Importantly, we are able to theoretically prove
that the learned distribution will have a representation rate and statistical
rate data fairness guarantee. Unlike recent optimization based pre-processing
methods, our approach can be easily adapted for continuous domain features.
Furthermore, when the weak learners are specified to be decision trees, the
sufficient statistics of the learned distribution can be examined to provide
clues on sources of (un)fairness. Empirical results are presented to
demonstrate the quality of the results on real-world data.
( 2
min )
Energy efficient navigation constitutes an important challenge in electric
vehicles, due to their limited battery capacity. We employ a Bayesian approach
to model the energy consumption at road segments for efficient navigation. In
order to learn the model parameters, we develop an online learning framework
and investigate several exploration strategies such as Thompson Sampling and
Upper Confidence Bound. We then extend our online learning framework to the
multi-agent setting, where multiple vehicles adaptively navigate and learn the
parameters of the energy model. We analyze Thompson Sampling and establish
rigorous regret bounds on its performance in the single-agent and multi-agent
settings, through an analysis of the algorithm under batched feedback. Finally,
we demonstrate the performance of our methods via experiments on several
real-world city road networks.
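A stripped-down version of the Thompson Sampling loop, with two hypothetical routes in place of road segments and Gaussian posteriors under an assumed unit observation noise; the route names, costs, and priors are all invented for illustration.

```python
import random

random.seed(1)

# two hypothetical routes with unknown true mean energy costs (kWh)
true_cost = {"route_A": 1.0, "route_B": 1.5}

# Gaussian posterior [mean, precision] per route, assuming unit observation
# noise; the prior is deliberately weak
posterior = {r: [0.0, 1e-3] for r in true_cost}

def thompson_choose():
    # sample a plausible cost for each route, then take the cheapest sample
    draws = {r: random.gauss(m, (1.0 / p) ** 0.5)
             for r, (m, p) in posterior.items()}
    return min(draws, key=draws.get)

def update(route, observed):
    # conjugate Gaussian update of the chosen route's posterior
    m, p = posterior[route]
    posterior[route] = [(p * m + observed) / (p + 1.0), p + 1.0]

pulls = {r: 0 for r in true_cost}
for _ in range(500):
    r = thompson_choose()
    pulls[r] += 1
    update(r, random.gauss(true_cost[r], 0.1))
# the cheaper route_A should be chosen far more often as the posteriors sharpen
```

The multi-agent and batched-feedback variants analyzed in the abstract extend this loop by sharing or delaying the posterior updates.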
( 2
min )
We prove that the Minimum Description Length learning rule exhibits tempered
overfitting. We obtain tempered agnostic finite sample learning guarantees and
characterize the asymptotic behavior in the presence of random label noise.
( 2
min )
We study the convergence rate of discretized Riemannian Hamiltonian Monte
Carlo on sampling from distributions in the form of $e^{-f(x)}$ on a convex
body $\mathcal{M}\subset\mathbb{R}^{n}$. We show that for distributions in the
form of $e^{-\alpha^{\top}x}$ on a polytope with $m$ constraints, the
convergence rate of a family of commonly-used integrators is independent of
$\left\Vert \alpha\right\Vert _{2}$ and the geometry of the polytope. In
particular, the implicit midpoint method (IMM) and the generalized Leapfrog
method (LM) have a mixing time of $\widetilde{O}\left(mn^{3}\right)$ to achieve
$\epsilon$ total variation distance to the target distribution. These
guarantees are based on a general bound on the convergence rate for densities
of the form $e^{-f(x)}$ in terms of parameters of the manifold and the
integrator. Our theoretical guarantee complements the empirical results of
[KLSV22], which shows that RHMC with IMM can sample ill-conditioned, non-smooth
and constrained distributions in very high dimension efficiently in practice.
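For intuition, here is the standard Euclidean leapfrog integrator on the Gaussian target $e^{-x^2/2}$; the paper's implicit midpoint and generalized Leapfrog methods are metric-aware Riemannian analogues of this sketch, not reproduced here.

```python
def leapfrog(x, p, grad_f, step, n_steps):
    # kick-drift-kick updates for H(x, p) = f(x) + p^2 / 2
    for _ in range(n_steps):
        p -= 0.5 * step * grad_f(x)
        x += step * p
        p -= 0.5 * step * grad_f(x)
    return x, p

f = lambda x: 0.5 * x * x   # target density e^{-f}: a standard Gaussian
grad_f = lambda x: x

x0, p0 = 1.0, 0.5
h0 = f(x0) + 0.5 * p0 * p0
x1, p1 = leapfrog(x0, p0, grad_f, step=0.01, n_steps=100)
h1 = f(x1) + 0.5 * p1 * p1
# the Hamiltonian is nearly conserved (error is O(step^2)); this near-exact
# energy conservation is what underlies mixing-time guarantees for HMC
```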
( 2
min )
Monotonic linear interpolation (MLI), the phenomenon that the loss and
accuracy are monotonic along the line connecting a random initialization with
the minimizer it converges to, is commonly observed in the training of neural
networks. Such a phenomenon may seem to suggest that optimization of neural
networks is easy. In this paper, we show that the MLI property is not
necessarily related to the hardness of optimization problems, and empirical
observations on MLI for deep neural networks depend heavily on biases. In
particular, we show that interpolating both weights and biases linearly leads
to very different influences on the final output, and when different classes
have different last-layer biases on a deep network, there will be a long
plateau in both the loss and accuracy interpolation (which existing theory of
MLI cannot explain). We also show how the last-layer biases for different
classes can be different even on a perfectly balanced dataset using a simple
model. Empirically we demonstrate that similar intuitions hold on practical
networks and realistic datasets.
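The MLI property itself is cheap to check numerically: evaluate the loss along the line $(1-t)\theta_0 + t\theta_1$. The toy convex loss below satisfies MLI trivially; the abstract's point concerns when and why deep, non-convex networks do or do not.

```python
def interpolate(theta0, theta1, t):
    # point on the line connecting initialization theta0 and minimizer theta1
    return [(1 - t) * a + t * b for a, b in zip(theta0, theta1)]

def loss(theta):
    # toy convex loss with minimizer (3, -1); real network losses are non-convex
    return sum((w - m) ** 2 for w, m in zip(theta, [3.0, -1.0]))

theta0, theta1 = [0.0, 0.0], [3.0, -1.0]
losses = [loss(interpolate(theta0, theta1, t / 10)) for t in range(11)]
is_monotone = all(a >= b for a, b in zip(losses, losses[1:]))
```

For a real network one would interpolate all weights and biases jointly and plot the loss over t, which is exactly where the paper shows the bias terms produce long plateaus.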
( 2
min )
In this paper, we interpret disentanglement as the discovery of local charts
and trace how that definition naturally leads to an equivalent condition for
disentanglement: the disentangled factors must commute with each other. We
discuss the practical and theoretical implications of commutativity, in
particular the compression and disentanglement of generative models. Finally,
we conclude with a discussion of related approaches to disentanglement and how
they relate to our view of disentanglement from the manifold perspective.
( 2
min )
We introduce a method for embedding graphs as vectors in a
structure-preserving manner, showcasing its rich representational capacity and
giving some theoretical properties. Our procedure falls under the bind-and-sum
approach, and we show that our binding operation, the tensor product, is the
most general binding operation that respects the principle of superposition. We
also establish some precise results characterizing the behavior of our method,
and we show that our use of spherical codes achieves a packing upper bound.
Then, we perform experiments showcasing our method's accuracy in various graph
operations even when the number of edges is quite large. Finally, we establish
a link to adjacency matrices, showing that our method is, in some sense, a
generalization of adjacency matrices with applications towards large sparse
graphs.
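A minimal version of the bind-and-sum scheme, with random sign vectors standing in for spherical codes: each edge is bound by a tensor (outer) product, the bindings are superposed by summation, and an inner product against a probe recovers approximate edge membership. The dimension and toy graph are illustrative choices, not the paper's.

```python
import math
import random

random.seed(0)
d = 256  # code dimension; larger d gives more nearly orthogonal codes

def rand_vec():
    # random unit sign vector, a cheap stand-in for a spherical code
    return [random.choice((-1.0, 1.0)) / math.sqrt(d) for _ in range(d)]

nodes = {v: rand_vec() for v in "abcd"}
edges = [("a", "b"), ("b", "c"), ("c", "d")]

def outer(u, v):
    # binding operation: the tensor (outer) product, flattened
    return [ui * vj for ui in u for vj in v]

# superpose (sum) the bound edges into a single graph vector
G = [0.0] * (d * d)
for u, v in edges:
    for idx, val in enumerate(outer(nodes[u], nodes[v])):
        G[idx] += val

def edge_score(u, v):
    # <G, u (x) v> is close to 1 for stored edges and close to 0 otherwise
    probe = outer(nodes[u], nodes[v])
    return sum(g * q for g, q in zip(G, probe))
```

Reading the graph vector against all node pairs recovers (a noisy version of) the adjacency matrix, which is the generalization the abstract alludes to.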
( 2
min )
In this paper, we study bottleneck identification in networks via extracting
minimax paths. Many real-world networks have stochastic weights for which full
knowledge is not available in advance. Therefore, we model this task as a
combinatorial semi-bandit problem to which we apply a combinatorial version of
Thompson Sampling and establish an upper bound on the corresponding Bayesian
regret. Due to the computational intractability of the problem, we then devise
an alternative problem formulation which approximates the original objective.
Finally, we experimentally evaluate the performance of Thompson Sampling with
the approximate formulation on real-world directed and undirected networks.
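With known weights, the minimax-path objective is a classic bottleneck shortest path, solvable by a Dijkstra variant that relaxes with max instead of +; the bandit setting replaces the known weights with posterior samples. A sketch of the deterministic core, on a made-up graph:

```python
import heapq

def minimax_path_cost(graph, source, target):
    # Dijkstra with max() in place of +: returns the minimum over all
    # source-target paths of the maximum edge weight along the path
    best = {source: 0}
    heap = [(0, source)]
    while heap:
        cost, u = heapq.heappop(heap)
        if u == target:
            return cost
        if cost > best.get(u, float("inf")):
            continue
        for v, w in graph.get(u, []):
            new_cost = max(cost, w)
            if new_cost < best.get(v, float("inf")):
                best[v] = new_cost
                heapq.heappush(heap, (new_cost, v))
    return float("inf")

# two s -> t routes: via "a" the bottleneck is 5, via "b" it is 3
graph = {"s": [("a", 5), ("b", 2)], "a": [("t", 1)], "b": [("t", 3)]}
bottleneck = minimax_path_cost(graph, "s", "t")  # 3, via b
```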
( 2
min )
Data pruning algorithms are commonly used to reduce the memory and
computational cost of the optimization process. Recent empirical results reveal
that random data pruning remains a strong baseline and outperforms most
existing data pruning methods in the high compression regime, i.e., where a
fraction of $30\%$ or less of the data is kept. This regime has recently
attracted a lot of interest as a result of the role of data pruning in
improving the so-called neural scaling laws; in [Sorscher et al.], the authors
showed the need for high-quality data pruning algorithms in order to beat the
sample power law.
In this work, we focus on score-based data pruning algorithms and show
theoretically and empirically why such algorithms fail in the high compression
regime. We demonstrate ``No Free Lunch'' theorems for data pruning and present
calibration protocols that enhance the performance of existing pruning
algorithms in this high compression regime using randomization.
( 2
min )
Diffusion models achieve state-of-the-art performance in various generation
tasks. However, their theoretical foundations fall far behind. This paper
studies score approximation, estimation, and distribution recovery of diffusion
models, when data are supported on an unknown low-dimensional linear subspace.
Our result provides sample complexity bounds for distribution estimation using
diffusion models. We show that with a properly chosen neural network
architecture, the score function can be both accurately approximated and
efficiently estimated. Furthermore, the generated distribution based on the
estimated score function captures the data geometric structures and converges
to a close vicinity of the data distribution. The convergence rate depends on
the subspace dimension, indicating that diffusion models can circumvent the
curse of data ambient dimensionality.
( 2
min )
We propose new limiting dynamics for stochastic gradient descent in the small
learning rate regime called stochastic modified flows. These SDEs are driven by
a cylindrical Brownian motion and improve the so-called stochastic modified
equations by having regular diffusion coefficients and by matching the
multi-point statistics. As a second contribution, we introduce distribution
dependent stochastic modified flows which we prove to describe the fluctuating
limiting dynamics of stochastic gradient descent in the small learning rate -
infinite width scaling regime.
( 2
min )
We study the problem of discrete distribution estimation in KL divergence and
provide concentration bounds for the Laplace estimator. We show that the
deviation from mean scales as $\sqrt{k}/n$ when $n \ge k$, improving upon the
best prior result of $k/n$. We also establish a matching lower bound that shows
that our bounds are tight up to polylogarithmic factors.
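For reference, the Laplace (add-one) estimator and the KL loss it is evaluated under, on a small made-up example:

```python
from collections import Counter
import math

def laplace_estimate(samples, k):
    # add-one smoothing: p_hat[i] = (count_i + 1) / (n + k); no symbol gets
    # zero mass, so KL(p || p_hat) is always finite
    counts = Counter(samples)
    n = len(samples)
    return [(counts.get(i, 0) + 1) / (n + k) for i in range(k)]

def kl_divergence(p, q):
    return sum(pi * math.log(pi / qi) for pi, qi in zip(p, q) if pi > 0)

p_true = [0.5, 0.3, 0.2]
samples = [0] * 50 + [1] * 30 + [2] * 20  # idealized n = 100 draws, k = 3
p_hat = laplace_estimate(samples, k=3)
kl = kl_divergence(p_true, p_hat)  # small for n >> k
```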
( 2
min )
Machine-learned coarse-grained (CG) models have the potential for simulating
large molecular complexes beyond what is possible with atomistic molecular
dynamics. However, training accurate CG models remains a challenge. A widely
used methodology for learning CG force-fields maps forces from all-atom
molecular dynamics to the CG representation and matches them with a CG
force-field on average. We show that there is flexibility in how to map
all-atom forces to the CG representation, and that the most commonly used
mapping methods are statistically inefficient and potentially even incorrect in
the presence of constraints in the all-atom simulation. We define an
optimization statement for force mappings and demonstrate that substantially
improved CG force-fields can be learned from the same simulation data when
using optimized force maps. The method is demonstrated on the miniproteins
Chignolin and Tryptophan Cage and published as open-source code.
( 2
min )
Hyperbolic spaces have been quite popular in the recent past for representing
hierarchically organized data. Further, several classification algorithms for
data in these spaces have been proposed in the literature. These algorithms
mainly use either hyperplanes or geodesics as decision boundaries in a
large-margin classifier setting, leading to a non-convex optimization problem. In
this paper, we propose a novel large margin classifier based on horocycle
(horosphere) decision boundaries that leads to a geodesically convex
optimization problem that can be optimized using any Riemannian gradient
descent technique guaranteeing a globally optimal solution. We present several
experiments depicting the performance of our classifier.
( 2
min )
Hello everyone. I am a software engineering assistant professor at a private university. I have got lots of older lecture videos on my channel.
I am using NVIDIA broadcast to remove noise and it works very well.
However, I want to improve audio quality as well.
After doing a lot of research I found that audio super-resolution is the way to go
The only GitHub repo I have found so far is not working.
Any help is appreciated
How can I improve speech quality?
Here is my example lecture video (noise removed already - reuploaded - but the sound is not good):
C# Programming For Beginners - Lecture 2: Coding our First Application in .NET Core Console
https://youtu.be/XLsrsCCdSnU
Amazon SageMaker JumpStart is the machine learning (ML) hub of SageMaker that offers over 350 built-in algorithms, pre-trained models, and pre-built solution templates to help you get started with ML fast. JumpStart provides one-click access to a wide variety of pre-trained models for common ML tasks such as object detection, text classification, summarization, text generation […]
( 11
min )
Here is a podcast episode with Noam Brown from Meta AI where we discuss his work on achieving human-level performance on poker and Diplomacy, as well as the power of spending compute at inference time!
AI-augmented applications, photorealistic rendering, simulation and other technologies are helping professionals achieve business-critical results from multi-app workflows faster than ever. Running these data-intensive, complex workflows, as well as sharing data and collaborating across geographically dispersed teams, requires workstations with high-end CPUs, GPUs and advanced networking. To help meet these demands, Intel and NVIDIA are powering […]
( 6
min )
Whether creating realistic digital humans that can express emotion or building immersive virtual worlds, 3D artists can reach new heights with NVIDIA Omniverse, a platform for creating and operating metaverse applications. A new Blender alpha release, now available in the Omniverse Launcher, lets users of the 3D graphics software optimize scenes and streamline workflows with […]
( 5
min )
Surfers, swimmers and beachgoers face a hidden danger in the ocean: rip currents. These narrow channels of water can flow away from the shore at speeds up to 2.5 meters per second, making them one of the biggest safety risks for those enjoying the ocean. To help keep beachgoers safe, Christo Rautenbach, a coastal and […]
( 4
min )
One of the primary goals in spectrum occupancy mapping is to create a system
that is robust to assumptions about the number of sensors, occupancy threshold
(in dBm), sensor noise, number of emitters and the propagation environment. We
show that such a system may be designed with neural networks using a process of
aggregation to allow a variable number of sensors during training and testing.
This process transforms the variable number of measurements into approximate
log-likelihood ratios (LLRs), which are fed as a fixed-resolution image into a
neural network. The use of LLRs provides robustness to the effects of noise
and occupancy threshold. In other words, a system may be trained for a nominal
number of sensors, threshold and noise levels, and still operate well at
various other levels without retraining. Our system operates without knowledge
of the number of emitters and does not explicitly attempt to estimate their
number or power. Receiver operating curves with realistic propagation
environments using topographic maps with commercial network design tools show
how performance of the neural network varies with the environment. The use of
very low-resolution sensors in this system can still yield good performance.
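The per-sensor transformation can be illustrated in isolation: under Gaussian occupied/vacant hypotheses, a power measurement is converted to a log-likelihood ratio before being rasterized into the fixed-resolution input image. The means and noise level below are made up, not the paper's.

```python
import math

def llr(y_dbm, mu_occ, mu_vac, sigma):
    # log-likelihood ratio of one power measurement under Gaussian
    # occupied vs. vacant hypotheses
    def logpdf(y, mu):
        return (-0.5 * ((y - mu) / sigma) ** 2
                - math.log(sigma * math.sqrt(2.0 * math.pi)))
    return logpdf(y_dbm, mu_occ) - logpdf(y_dbm, mu_vac)

# a measurement at the assumed occupied level is strong positive evidence
evidence = llr(-70.0, mu_occ=-70.0, mu_vac=-90.0, sigma=5.0)
```

Because the network consumes LLRs rather than raw powers, retraining is not needed when the occupancy threshold or noise level shifts.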
( 2
min )
Q-learning and SARSA with $\epsilon$-greedy exploration are leading
reinforcement learning methods. Their tabular forms converge to the optimal
Q-function under reasonable conditions. However, with function approximation,
these methods exhibit strange behaviors such as policy oscillation, chattering,
and convergence to different attractors (possibly even the worst policy) on
different runs, apart from the usual instability. A theory to explain these
phenomena has been a long-standing open problem, even for basic linear function
approximation (Sutton, 1999). Our work uses differential inclusion to provide
the first framework for resolving this problem. We also provide numerical
examples to illustrate our framework's prowess in explaining these algorithms'
behaviors.
( 2
min )
Approximating Stochastic Gradient Descent (SGD) as a Stochastic Differential
Equation (SDE) has allowed researchers to enjoy the benefits of studying a
continuous optimization trajectory while carefully preserving the stochasticity
of SGD. Analogous study of adaptive gradient methods, such as RMSprop and Adam,
has been challenging because there were no rigorously proven SDE approximations
for these methods. This paper derives the SDE approximations for RMSprop and
Adam, giving theoretical guarantees of their correctness as well as
experimental validation of their applicability to common large-scale vision
and language settings. A key practical result is the derivation of a
$\textit{square root scaling rule}$ to adjust the optimization hyperparameters
of RMSprop and Adam when changing batch size, and its empirical validation in
deep learning settings.
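The headline prescription of a square-root scaling rule is simple to state in code; note that the full rule also prescribes adjusting the moment-decay hyperparameters of RMSprop/Adam, which this sketch omits.

```python
import math

def sqrt_scaled_lr(lr, batch_size, new_batch_size):
    # scale the learning rate by sqrt(kappa) when the batch size grows by
    # a factor kappa (contrast with the linear rule often used for SGD)
    kappa = new_batch_size / batch_size
    return lr * math.sqrt(kappa)

new_lr = sqrt_scaled_lr(1e-3, 256, 1024)  # kappa = 4 -> learning rate doubles
```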
( 2
min )
We provide a first finite-particle convergence rate for Stein variational
gradient descent (SVGD). Specifically, whenever the target distribution is
sub-Gaussian with a Lipschitz score, SVGD with $n$ particles and an appropriate
step size sequence drives the kernel Stein discrepancy to zero at an order
$1/\sqrt{\log \log n}$ rate. We suspect that the dependence on $n$ can be improved,
and we hope that our explicit, non-asymptotic proof strategy will serve as a
template for future refinements.
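A bare-bones SVGD loop in one dimension, for a standard Gaussian target (score $-x$) and an RBF kernel; the step size, bandwidth, and particle count are arbitrary illustrative choices rather than anything prescribed by the result above.

```python
import math
import random

random.seed(0)

def svgd_step(xs, step, h=1.0):
    # one SVGD update for a 1-D standard Gaussian target (score(x) = -x)
    # with an RBF kernel of bandwidth h
    n = len(xs)
    new = []
    for xi in xs:
        phi = 0.0
        for xj in xs:
            k = math.exp(-((xj - xi) ** 2) / (2.0 * h))
            grad_k = (xi - xj) / h * k  # derivative of k in its first argument
            phi += k * (-xj) + grad_k   # attractive score term + repulsion
        new.append(xi + step * phi / n)
    return new

xs = [random.uniform(2.0, 4.0) for _ in range(50)]  # start far from the target
for _ in range(200):
    xs = svgd_step(xs, step=0.1)
mean = sum(xs) / len(xs)  # particles drift toward the target mean 0
```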
( 2
min )
Despite the impressive successes of deep learning approaches for various
chemical problems such as property prediction, virtual screening, and de novo
molecule design, separately designed models for specific tasks are usually
required, and it is often difficult to synergistically combine these models for
novel tasks. To address this, here we present a bidirectional molecular
foundation model that can be used for both molecular structure and property
inferences through a single model, inspired by recent multimodal learning
methods such as VLP. Furthermore, thanks to the outstanding structure/property
alignment in a common embedding space, experimental results confirm that our
method leads to state-of-the-art performance and interpretable attention maps
in both multimodal and unimodal tasks, including conditional molecule
generation, property prediction, molecule classification, and reaction
prediction.
( 2
min )
We survey a current, heated debate in the AI research community on whether
large pre-trained language models can be said to "understand" language -- and
the physical and social situations language encodes -- in any important sense.
We describe arguments that have been made for and against such understanding,
and key questions for the broader sciences of intelligence that have arisen in
light of these arguments. We contend that a new science of intelligence can be
developed that will provide insight into distinct modes of understanding, their
strengths and limitations, and the challenge of integrating diverse forms of
cognition.
( 2
min )
A formal write-up of the simple proof (1995) of the existence of calibrated
forecasts by the minimax theorem, which moreover shows that $N^3$ periods
suffice to guarantee a calibration error of at most $1/N$.
( 2
min )
We present ASR Bundestag, a dataset for automatic speech recognition in
German, consisting of 610 hours of aligned audio-transcript pairs for
supervised training as well as 1,038 hours of unlabeled audio snippets for
self-supervised learning, based on raw audio data and transcriptions from
plenary sessions and committee meetings of the German parliament. In addition,
we discuss utilized approaches for the automated creation of speech datasets
and assess the quality of the resulting dataset based on evaluations and
fine-tuning of a pre-trained state-of-the-art model. We make the dataset
publicly available, including all subsets.
( 2
min )
We propose a new \textit{quadratic programming-based} method of approximating
a nonstandard density using a multivariate Gaussian density. Such nonstandard
densities usually arise while developing posterior samplers for unobserved
components models involving inequality constraints on the parameters. For
instance, Chan et al. (2016) provided a new model of trend inflation with
linear inequality constraints on the stochastic trend. We implement the
proposed quadratic programming-based method for this model and compare it to
the existing approximation. We observe that the proposed method works as well
as the existing approximation in terms of the final trend estimates while
achieving gains in terms of sample efficiency.
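The paper's method is multivariate and QP-based; purely as an illustration of the underlying idea (and not the authors' algorithm), a one-dimensional sketch can fit a quadratic to an unnormalised log-density by least squares and read off the Gaussian parameters. The target below, a standard normal truncated to $x \ge 0$, is an assumed toy example.

```python
import numpy as np

# Target: standard normal truncated to x >= 0; its unnormalised
# log-density on x >= 0 is -x^2 / 2.
xs = np.linspace(0.0, 4.0, 200)
log_p = -xs ** 2 / 2                       # unnormalised log-density

# Fit log_p ~ a x^2 + b x + c by least squares, then read off the
# approximating Gaussian: variance = -1/(2a), mean = -b/(2a).
A = np.column_stack([xs ** 2, xs, np.ones_like(xs)])
(a, b, c), *_ = np.linalg.lstsq(A, log_p, rcond=None)
var = -1 / (2 * a)
mean = -b / (2 * a)
print(round(mean, 3), round(var, 3))
```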
( 2
min )
We develop a new approach to drifting games, a class of two-person games with
many applications to boosting and online learning settings. Our approach
involves (a) guessing an asymptotically optimal potential by solving an
associated partial differential equation (PDE); then (b) justifying the guess,
by proving upper and lower bounds on the final-time loss whose difference
scales like a negative power of the number of time steps. The proofs of our
potential-based upper bounds are elementary, using little more than Taylor
expansion. The proofs of our potential-based lower bounds are also elementary,
combining Taylor expansion with probabilistic or combinatorial arguments. Not
only is our approach more elementary, but we give new potentials and derive
corresponding upper and lower bounds that match each other in the asymptotic
regime.
( 2
min )
We study the convergence rate of discretized Riemannian Hamiltonian Monte
Carlo on sampling from distributions in the form of $e^{-f(x)}$ on a convex
body $\mathcal{M}\subset\mathbb{R}^{n}$. We show that for distributions in the
form of $e^{-\alpha^{\top}x}$ on a polytope with $m$ constraints, the
convergence rate of a family of commonly-used integrators is independent of
$\left\Vert \alpha\right\Vert _{2}$ and the geometry of the polytope. In
particular, the implicit midpoint method (IMM) and the generalized Leapfrog
method (LM) have a mixing time of $\widetilde{O}\left(mn^{3}\right)$ to achieve
$\epsilon$ total variation distance to the target distribution. These
guarantees are based on a general bound on the convergence rate for densities
of the form $e^{-f(x)}$ in terms of parameters of the manifold and the
integrator. Our theoretical guarantee complements the empirical results of
[KLSV22], which show that RHMC with IMM can sample ill-conditioned, non-smooth
and constrained distributions in very high dimension efficiently in practice.
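The integrators analysed are manifold variants; as a simplified, purely illustrative Euclidean stand-in, one leapfrog trajectory for a target $e^{-f(x)}$ with $f(x) = \|x\|^2/2$ can be sketched as follows, checking the near-conservation of the Hamiltonian that underpins such integrators.

```python
import numpy as np

def grad_f(x):
    # Gradient of f(x) = ||x||^2 / 2 (standard Gaussian potential)
    return x

def leapfrog(x, p, step, n_steps):
    # Standard kick-drift-kick leapfrog integration
    p = p - 0.5 * step * grad_f(x)       # initial half momentum step
    for _ in range(n_steps - 1):
        x = x + step * p                 # full position step
        p = p - step * grad_f(x)         # full momentum step
    x = x + step * p
    p = p - 0.5 * step * grad_f(x)       # final half momentum step
    return x, p

x0, p0 = np.array([1.0]), np.array([0.0])
x1, p1 = leapfrog(x0, p0, step=0.01, n_steps=100)

# Leapfrog nearly conserves the Hamiltonian H = f(x) + ||p||^2 / 2
H0 = 0.5 * x0 @ x0 + 0.5 * p0 @ p0
H1 = 0.5 * x1 @ x1 + 0.5 * p1 @ p1
print(abs(H1 - H0))
```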
( 2
min )
The COVID-19 pandemic has significantly impacted the construction sector,
which is sensitive to economic cycles. In order to boost value and efficiency
in this sector, the use of innovative exploration technologies such as
ultrasonic and Artificial Intelligence techniques in building material research
is becoming increasingly crucial. In this study, we developed two models for
predicting the Los Angeles (LA) and Micro Deval (MDE) coefficients, two
important geotechnical tests used to determine the quality of rock aggregates.
These coefficients describe the resistance of aggregates to fragmentation and
abrasion. The ultrasound velocity, porosity, and density of the rocks were
determined and used as inputs to develop prediction models using multiple
regression and an artificial neural network. These models may be used to assess
the quality of rock aggregates at the exploration stage without the need for
tedious laboratory analysis.
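A minimal multiple-regression sketch of this kind of model can be set up as below; the data and the linear relationship are entirely synthetic (the study's actual measurements and coefficients are not reproduced here).

```python
import numpy as np

# Synthetic stand-ins for the three measured inputs
rng = np.random.default_rng(0)
n = 200
velocity = rng.uniform(3.0, 6.0, n)   # ultrasound velocity, km/s
porosity = rng.uniform(0.5, 10.0, n)  # percent
density = rng.uniform(2.4, 2.9, n)    # g/cm^3

# Assumed ground-truth relationship plus noise (illustrative only)
la = 40.0 - 4.0 * velocity + 1.5 * porosity - 3.0 * density \
    + rng.normal(0, 0.5, n)

# Multiple linear regression via least squares
X = np.column_stack([np.ones(n), velocity, porosity, density])
coef, *_ = np.linalg.lstsq(X, la, rcond=None)
print(coef.round(2))  # intercept, then weights for the three inputs
```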
( 2
min )
Despite all the benefits of automated hyperparameter optimization (HPO), most
modern HPO algorithms are black-boxes themselves. This makes it difficult to
understand the decision process which leads to the selected configuration,
reduces trust in HPO, and thus hinders its broad adoption. Here, we study the
combination of HPO with interpretable machine learning (IML) methods such as
partial dependence plots. These techniques are increasingly used to explain
the marginal effect of hyperparameters on the black-box cost function or to
quantify the importance of hyperparameters. However, if such methods are
naively applied to the experimental data of the HPO process in a post-hoc
manner, the underlying sampling bias of the optimizer can distort
interpretations. We propose a modified HPO method which efficiently balances
the search for the global optimum w.r.t. predictive performance \emph{and} the
reliable estimation of IML explanations of an underlying black-box function by
coupling Bayesian optimization and Bayesian Algorithm Execution. On benchmark
cases of both synthetic objectives and HPO of a neural network, we demonstrate
that our method returns more reliable explanations of the underlying black-box
without a loss of optimization performance.
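A partial dependence estimate for one hyperparameter can be sketched as below; the toy cost function and the grids are made up, and a known function stands in for the surrogate model a real HPO run would fit.

```python
import numpy as np

def cost(lr, width):
    # Toy black-box cost: best around lr = 1e-2 and width = 64
    return (np.log10(lr) + 2) ** 2 + 0.1 * (width - 64) ** 2 / 64

lr_grid = np.logspace(-4, 0, 9)              # hyperparameter of interest
width_samples = np.array([16, 32, 64, 128])  # marginalised hyperparameter

# Partial dependence: average the cost over the other hyperparameter
pd = np.array([cost(lr, width_samples).mean() for lr in lr_grid])
best_lr = lr_grid[pd.argmin()]
print(best_lr)
```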
( 2
min )
This manuscript investigates the one-pass stochastic gradient descent (SGD)
dynamics of a two-layer neural network trained on Gaussian data and labels
generated by a similar, though not necessarily identical, target function. We
rigorously analyse the limiting dynamics via a deterministic and
low-dimensional description in terms of the sufficient statistics for the
population risk. Our unifying analysis bridges different regimes of interest,
such as the classical gradient-flow regime of vanishing learning rate, the
high-dimensional regime of large input dimension, and the overparameterised
"mean-field" regime of large network width, covering as well the intermediate
regimes where the limiting dynamics is determined by the interplay between
these behaviours. In particular, in the high-dimensional limit, the
infinite-width dynamics is found to remain close to a low-dimensional subspace
spanned by the target principal directions. Our results therefore provide a
unifying picture of the limiting SGD dynamics with synthetic data.
( 2
min )
This paper empirically studies commonly observed training difficulties of
Physics-Informed Neural Networks (PINNs) on dynamical systems. Our results
indicate that fixed points, which are inherent to these systems, play a key role
in the optimization of the physics loss function embedded in PINNs. We observe
that the loss landscape exhibits local optima that are shaped by the presence
of fixed points. We find that these local optima contribute to the complexity
of the physics loss optimization which can explain common training difficulties
and resulting nonphysical predictions. Under certain settings, e.g., initial
conditions close to fixed points or long simulation times, we show that these
optima can even attain a lower loss than the desired solution.
( 2
min )
We describe a parametrized space for simple meta-reinforcement-learning
(meta-RL) tasks with arbitrary stimuli. The parametrization allows us to
randomly generate an arbitrary number of novel simple meta-learning tasks. The
space of meta-RL tasks covered by this parametrization includes many well-known
meta-RL tasks, such as bandit tasks, the Harlow task, T-mazes, the Daw two-step
task and others. Simple extensions allow it to capture tasks based on
two-dimensional topological spaces, such as find-the-spot or key-door tasks. We
describe a number of randomly generated meta-RL tasks and discuss potential
issues arising from random generation.
( 2
min )
Advances in neural modeling have achieved state-of-the-art (SOTA) results on
public natural language processing (NLP) benchmarks, at times surpassing human
performance. However, there is a gap between public benchmarks and real-world
applications where noise, such as typographical or grammatical mistakes, is
abundant and can result in degraded performance. Unfortunately, works that
evaluate the robustness of neural models on noisy data and propose
improvements are limited to the English language. Upon analyzing noise in
different languages, we observe that noise types vary greatly across languages.
Thus, existing investigations do not generalize trivially to multilingual
settings. To benchmark the performance of pretrained multilingual language
models, we construct noisy datasets covering five languages and four NLP tasks
and observe a clear gap in the performance between clean and noisy data in the
zero-shot cross-lingual setting. After investigating several ways to boost the
robustness of multilingual models in this setting, we propose Robust
Contrastive Pretraining (RCP). RCP combines data augmentation with a
contrastive loss term at the pretraining stage and achieves large improvements
on noisy (and original) test data across two sentence-level (+3.2%) and two
sequence-labeling (+10 F1-score) multilingual classification tasks.
( 2
min )
As advertisers increasingly shift their budgets toward digital advertising,
forecasting advertising costs is essential for making budget plans to optimize
marketing campaign returns. In this paper, we perform a comprehensive study
using a variety of time-series forecasting methods to predict daily average
cost-per-click (CPC) in the online advertising market. We show that forecasting
advertising costs would benefit from multivariate models using covariates from
competitors' CPC development identified through time-series clustering. We
further interpret the results by analyzing feature importance and temporal
attention. Finally, we show that our approach has several advantages over
models that individual advertisers might build based solely on their collected
data.
( 2
min )
We motivate and introduce CHARD: Clinical Health-Aware Reasoning across
Dimensions, to investigate the capability of text generation models to act as
implicit clinical knowledge bases and generate free-flow textual explanations
about various health-related conditions across several dimensions. We collect
and present an associated dataset, CHARDat, consisting of explanations about 52
health conditions across three clinical dimensions. We conduct extensive
experiments using BART and T5 along with data augmentation, and perform
automatic, human, and qualitative analyses. We show that while our models can
perform decently, CHARD is very challenging with strong potential for further
exploration.
( 2
min )
In addition to the weights of shared synaptic connections, PNN includes
weights for the effective ranges of synapses [14-24]. PNN accounts for synaptic
strength balance through the dynamics of synapse phagocytosis and the static
constraint of a constant total synapse length [14], and incorporates the lead
behavior of a school of fish. In both experiments and PNN simulations, synapse
formation inhibits dendrite generation to a certain extent [15]. The memory
persistence gradient of the retrograde circuit resembles enforcing resilience
in a Spring Boot application. The relatively good and inferior gradient
information stored in memory engram cells during synapse formation of the
retrograde circuit resembles the folds of the brain [16]. Regarding the
controversy over whether human hippocampal neurogenesis persists throughout
aging, PNN suggests that a new and longer circuit may form in late iterations
[17,18]. Closing the critical period causes neurological disorders in both
experiments and PNN simulations [19]. Considering the persistence of both
negative and positive memories helps synapse lengths adapt across iterations
better than considering positive memories alone [20]. In simulation, astrocytic
phagocytosis avoids the local accumulation of synapses; a lack of astrocytic
phagocytosis causes excitatory and functionally impaired synapses to accumulate
in experiments, destroying cognition, and leads to locally longer synapses and
worse results in PNN simulations [21]. This gives a relationship between
intelligence, cortical thickness, and individual differences in the brain [22].
PNN also considers the memory engram cells that strengthen synaptic strength
[23]. The effects of PNN's memory structure and tPBM may be the same, owing to
the powerful penetrability of signals [24]. Memory persistence also inhibits
local synaptic accumulation. Through PNN, the relatively good and inferior
solutions may be introduced into PSO. The simple PNN only includes synaptic
phagocytosis.
( 3
min )
Reinforcement learning is an effective way to solve decision-making
problems. Investigating autonomous air combat maneuver decision-making methods
based on reinforcement learning is a meaningful and valuable direction.
However, when using reinforcement learning to solve decision-making
problems with sparse rewards, such as air combat maneuver decision-making, it
costs too much time for training and the performance of the trained agent may
not be satisfactory. To solve these problems, a method based on curriculum
learning is proposed. First, three curricula of air combat maneuver
decision-making are designed: an angle curriculum, a distance curriculum, and a
hybrid curriculum. Each curriculum is used to train air combat agents and
compared with the original method without any curriculum. The training results
show that the angle curriculum can increase the speed and stability of
training and improve the performance of the agent; the distance curriculum can
increase the speed and stability of agent training; the hybrid curriculum has a
negative impact on training, because it makes the agent get stuck at a local
optimum. The simulation results show that after training, the agent can handle
situations where targets come from different directions, and the maneuver
decision results are consistent with the characteristics of the missile.
( 2
min )
Traffic signal control is safety-critical for our daily life. Roughly
one-quarter of road accidents in the U.S. happen at intersections due to
problematic signal timing, urging the development of safety-oriented
intersection control. However, existing studies on adaptive traffic signal
control using reinforcement learning technologies have focused mainly on
minimizing traffic delay while neglecting the potential exposure to unsafe
conditions. We, for the first time, incorporate road safety standards as an
enforcement mechanism to ensure the safety of existing reinforcement learning
methods, aiming toward operating intersections with zero collisions. We propose
a safety-enhanced residual reinforcement learning method (SafeLight) and employ
multiple optimization techniques, such as a multi-objective loss function and
reward shaping, for better knowledge integration. Extensive experiments are
conducted using both synthetic and real-world benchmark datasets. Results show
that our method can significantly reduce collisions while increasing traffic
mobility.
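The reward-shaping idea of combining a mobility objective with a safety term can be sketched as below; the particular signals (queue length, conflict count) and the weight are hypothetical, not SafeLight's actual formulation.

```python
# Illustrative shaped reward: combine a mobility reward with a safety
# penalty, as in multi-objective reward shaping (weights are made up).
def shaped_reward(queue_length, num_conflicts, w_safety=10.0):
    mobility = -float(queue_length)      # fewer queued vehicles is better
    safety = -w_safety * num_conflicts   # heavily penalise unsafe conflicts
    return mobility + safety

print(shaped_reward(queue_length=5, num_conflicts=0))  # safe phase
print(shaped_reward(queue_length=5, num_conflicts=2))  # unsafe phase
```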
( 2
min )
This work studies discrete diffusion probabilistic models with applications
to natural language generation. We derive an alternative yet equivalent
formulation of the sampling from discrete diffusion processes and leverage this
insight to develop a family of reparameterized discrete diffusion models. The
derived generic framework is highly flexible, offers a fresh perspective of the
generation process in discrete diffusion models, and features more effective
training and decoding techniques. We conduct extensive experiments to evaluate
the text generation capability of our model, demonstrating significant
improvements over existing diffusion models.
( 2
min )
We consider the problem of learning multioutput function classes in batch and
online settings. In both settings, we show that a multioutput function class is
learnable if and only if each single-output restriction of the function class
is learnable. This provides a complete characterization of the learnability of
multilabel classification and multioutput regression in both batch and online
settings. As an extension, we also consider multilabel learnability in the
bandit feedback setting and show a similar characterization as in the
full-feedback setting.
( 2
min )
In this paper, we extend the Wiener-Ito chaos decomposition to the class of
diffusion processes, whose drift and diffusion coefficient are of linear
growth. By omitting the orthogonality in the chaos expansion, we are able to
show that every $p$-integrable functional, for $p \in [1,\infty)$, can be
represented as a sum of iterated integrals of the underlying process. Using a
truncated sum of this expansion and (possibly random) neural networks for the
integrands, whose parameters are learned in a machine learning setting, we show
that every financial derivative can be approximated arbitrarily well in the
$L^p$-sense. Since the hedging strategy of the approximating option can be
computed in closed form, we obtain an efficient algorithm that can replicate
any integrable financial derivative with short runtime.
( 2
min )
The tremendous growth in smart devices has given rise to several security
threats. One of the most prominent threats is malicious software, also known as
malware. Malware has the capability of corrupting a device and collapsing an
entire network. Therefore, its early detection and mitigation are extremely
important to avoid catastrophic effects. In this work, we propose a solution
for malware detection using state-of-the-art natural language processing (NLP)
techniques. Our main focus is to provide a lightweight yet effective classifier
for malware detection which can be used for heterogeneous devices, be it a
resource-constrained device or a resourceful machine. Our proposed model is
tested on a benchmark dataset, achieving an accuracy of 99.13 percent and a
log loss of 0.04.
( 2
min )
Motivated by neural network training in low-bit floating and fixed-point
environments, this work studies the convergence of variants of SGD with
computational error. Considering a general stochastic Lipschitz continuous loss
function, a novel convergence result to a Clarke stationary point is presented
assuming that only an approximation of its stochastic gradient can be computed
as well as error in computing the SGD step itself. Different variants of SGD
are then tested empirically in a variety of low-precision arithmetic
environments, where improved test set accuracy is observed compared to SGD for
two image recognition tasks.
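One way to picture SGD with computational error is to round the gradient to a low-precision fixed-point grid before each update; the sketch below is an assumed toy setup on a one-dimensional quadratic, not the paper's experiments.

```python
import numpy as np

def quantize(g, scale=2 ** -6):
    # Round-to-nearest on a fixed-point grid with spacing `scale`,
    # mimicking a low-precision gradient computation
    return np.round(g / scale) * scale

rng = np.random.default_rng(1)
w = np.array([5.0])
for _ in range(500):
    grad = 2 * w + rng.normal(0, 0.1)   # stochastic gradient of f(w) = w^2
    w = w - 0.05 * quantize(grad)       # SGD step with quantized gradient
print(abs(w[0]))                        # ends up near the minimizer w = 0
```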
( 2
min )
Gradient descent methods have long been the de facto standard for training
deep neural networks. Millions of training samples are fed into models with
billions of parameters, which are slowly updated over hundreds of epochs.
Recently, it has been shown that large, randomly initialized neural networks
contain subnetworks that perform as well as fully trained models. This insight
offers a promising avenue for training future neural networks by simply pruning
weights from large, random models. However, this problem is combinatorially
hard and classical algorithms are not efficient at finding the best subnetwork.
In this paper, we explore how quantum algorithms could be formulated and
applied to this neuron selection problem. We introduce several methods for
local quantum neuron selection that reduce the entanglement complexity that
large scale neuron selection would require, making this problem more tractable
for current quantum hardware.
( 2
min )
Text-based game environments are challenging because agents must deal with
long sequences of text, execute compositional actions using text and learn from
sparse rewards. We address these challenges by proposing Long-Context Language
Decision Transformers (LLDTs), a framework that is based on long transformer
language models and decision transformers (DTs). LLDTs extend DTs with 3
components: (1) exponential tilt to guide the agent towards high obtainable
goals, (2) novel goal conditioning methods yielding significantly better
results than the traditional return-to-go (sum of all future rewards), and (3)
a model of future observations. Our ablation results show that predicting
future observations improves agent performance. To the best of our knowledge,
LLDTs are the first to address offline RL with DTs on these challenging games.
Our experiments show that LLDTs achieve the highest scores among many different
types of agents on some of the most challenging Jericho games, such as
Enchanter.
( 2
min )
Graph Neural Networks (GNNs) have achieved much success on graph-structured
data. In light of this, there have been increasing interests in studying their
expressive power. One line of work studies the capability of GNNs to
approximate permutation-invariant functions on graphs, and another focuses on
their power as tests for graph isomorphism. Our work connects these two
perspectives and proves their equivalence. We further develop a framework of
the expressive power of GNNs that incorporates both of these viewpoints using
the language of sigma-algebra, through which we compare the expressive power of
different types of GNNs together with other graph isomorphism tests. In
particular, we prove that the second-order Invariant Graph Network fails to
distinguish non-isomorphic regular graphs with the same degree. Then, we extend
it to a new architecture, Ring-GNN, which succeeds in distinguishing these
graphs and achieves good performance on real-world datasets.
( 2
min )
Recently, \cite{montasser2019vc} showed that finite VC dimension is not
sufficient for \textit{proper} adversarially robust PAC learning. In light of
this hardness result, there is a growing effort to study what type of
relaxations to the adversarially robust PAC learning setup can enable proper
learnability. In this work, we initiate the study of proper learning under
relaxations of the worst-case robust loss. We give a family of robust loss
relaxations under which VC classes are properly PAC learnable with sample
complexity close to what one would require in the standard PAC learning setup.
On the other hand, we show that for an existing and natural relaxation of the
worst-case robust loss, finite VC dimension is not sufficient for proper
learning. Lastly, we give new generalization guarantees for the adversarially
robust empirical risk minimizer.
( 2
min )
We prove a convergence theorem for U-statistics of degree two, where the data
dimension $d$ is allowed to scale with sample size $n$. We find that the
limiting distribution of a U-statistic undergoes a phase transition from the
non-degenerate Gaussian limit to the degenerate limit, regardless of its
degeneracy and depending only on a moment ratio. A surprising consequence is
that a non-degenerate U-statistic in high dimensions can have a non-Gaussian
limit with a larger variance and asymmetric distribution. Our bounds are valid
for any finite $n$ and $d$, independent of individual eigenvalues of the
underlying function, and dimension-independent under a mild assumption. As an
application, we apply our theory to two popular kernel-based distribution
tests, MMD and KSD, whose high-dimensional performance has been challenging to
study. In a simple empirical setting, our results correctly predict how the
test power at a fixed threshold scales with $d$ and the bandwidth.
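A concrete degree-two U-statistic can be computed as below; the kernel $h(x,y) = (x-y)^2/2$ yields the unbiased sample variance, a standard identity, and the data are synthetic.

```python
import numpy as np

rng = np.random.default_rng(0)
n = 2000
xs = rng.normal(0.0, 2.0, n)            # true variance is 4

# Degree-two U-statistic with kernel h(x, y) = (x - y)^2 / 2:
# average over all ordered pairs i != j (diagonal terms are zero)
diffs = np.subtract.outer(xs, xs)
u_stat = (diffs ** 2).sum() / (2 * n * (n - 1))
print(round(u_stat, 2))                 # close to the true variance
```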
( 2
min )
Deep neural networks (DNNs) have shown great capacity for modeling dynamical
systems; nevertheless, they usually do not obey physics constraints such as
conservation laws. This paper proposes a new learning framework named ConCerNet
to improve the trustworthiness of DNN-based dynamics modeling by endowing it
with invariant properties. ConCerNet consists of two steps: (i) a contrastive
learning method to automatically capture the system invariants (i.e.
conservation properties) along the trajectory observations; (ii) a neural
projection layer to guarantee that the learned dynamics models preserve the
learned invariants. We theoretically prove the functional relationship between
the learned latent representation and the unknown system invariant function.
Experiments show that our method consistently outperforms the baseline neural
networks in both coordinate error and conservation metrics by a large margin.
With neural network based parameterization and no dependence on prior
knowledge, our method can be extended to complex and large-scale dynamics by
leveraging an autoencoder.
( 2
min )
In this work, we consider the stochastic optimal control problem in
continuous time and a policy gradient method to solve it. In particular, we
study the gradient flow for the control, viewed as a continuous time limit of
the policy gradient. We prove the global convergence of the gradient flow and
establish a convergence rate under some regularity assumptions. The main
novelty in the analysis is the notion of local optimal control function, which
is introduced to compare the local optimality of the iterate.
( 2
min )
Human-robot interaction (HRI) research is progressively addressing
multi-party scenarios, where a robot interacts with more than one human user at
the same time. Conversely, research is still at an early stage for human-robot
collaboration (HRC). The use of machine learning techniques to handle this
type of collaboration requires data that are less feasible to produce than in a
typical HRC setup. This work outlines design concepts for concurrent tasks
for non-dyadic HRC applications. Based upon these concepts, this study also
proposes an alternative way of gathering data regarding multiuser activity, by
collecting data related to single subjects and merging them in post-processing,
to reduce the effort involved in producing recordings of pair settings. To
validate this statement, 3D skeleton poses of activity of single subjects were
collected and merged in pairs. After this, the datapoints were used to
separately train a long short-term memory (LSTM) network and a variational
autoencoder (VAE) composed of spatio-temporal graph convolutional networks
(STGCN) to recognise the joint activities of the pairs of people. The results
showed that it is possible to make use of data collected in this way for pair
HRC settings and get similar performances compared to using data regarding
groups of users recorded under the same settings, relieving researchers of the
technical difficulties involved in producing these data.
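The merging step described above can be pictured as stacking two independently recorded single-subject skeleton sequences into one pair sample; the array shapes below (frames, joints, coordinates) are assumed for illustration.

```python
import numpy as np

# Two independently recorded single-subject sequences of equal length:
# 100 frames, 25 joints, 3D coordinates (synthetic placeholders)
rng = np.random.default_rng(0)
seq_a = rng.normal(size=(100, 25, 3))   # subject A
seq_b = rng.normal(size=(100, 25, 3))   # subject B

# Merge in post-processing into one "pair" sample with a person axis
pair = np.stack([seq_a, seq_b], axis=1)
print(pair.shape)                        # (frames, persons, joints, xyz)
```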
( 2
min )
In a recent paper, Ling et al. investigated the over-parametrized Deep
Equilibrium Model (DEQ) with ReLU activation and proved that gradient descent
converges to a globally optimal solution at a linear convergence rate
for the quadratic loss function. In this paper, we show that this fact still
holds for DEQs with any general activation which has bounded first and second
derivatives. Since the new activation function is generally non-linear, a
general population Gram matrix is designed, and a new form of dual activation
with Hermite polynomial expansion is developed.
( 2
min )
Non-intrusive load monitoring (NILM) aims to decompose an aggregated
electrical usage signal into appliance-specific power consumption, a classical
example of blind source separation. Leveraging recent progress
on deep learning techniques, we design a new neural NILM model Multi-State Dual
CNN (MSDC). Different from previous models, MSDC explicitly extracts
information about the appliance's multiple states and state transitions, which
in turn regulates the prediction of signals for appliances. More specifically,
we employ a dual-CNN architecture: one CNN for outputting state distributions
and the other for predicting the power of each state. A new technique is
invented that utilizes conditional random fields (CRF) to capture state
transitions. Experiments on two real-world datasets REDD and UK-DALE
demonstrate that our model significantly outperforms state-of-the-art models
while having good generalization capacity, achieving a 6%-10% MAE gain and a
33%-51% SAE gain on unseen appliances.
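The dual-head idea of combining a per-step state distribution with per-state power levels can be sketched as an expectation; every number below is made up, and the CNNs and CRF that would produce these quantities are omitted.

```python
import numpy as np

# One head would output a state distribution per time step (T x K),
# the other a power level per state (K,); illustrative values only
state_probs = np.array([
    [0.9, 0.1, 0.0],                    # mostly "off"
    [0.1, 0.8, 0.1],                    # mostly "low"
    [0.0, 0.2, 0.8],                    # mostly "high"
])
state_power = np.array([0.0, 50.0, 200.0])  # watts per state

# Predicted power is the expectation of state power under the
# state distribution at each time step
predicted_power = state_probs @ state_power
print(predicted_power)
```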
( 2
min )
With the increased usage of artificial intelligence (AI), it is imperative to
understand how these models work internally. These needs have led to the
development of a new field called eXplainable artificial intelligence (XAI).
This field consists of a set of techniques that allow us to theoretically
determine the cause of AI decisions. One unsolved question about XAI is how
to measure the quality of explanations. In this study, we propose a new method
to generate datasets with ground truth (GT). These datasets allow us to measure
how faithful a method is without ad hoc solutions. We conducted a set of
experiments that compared our GT with real model explanations and obtained
excellent results confirming that our proposed method is correct.
( 2
min )
Engineering more secure software has become a critical challenge in the cyber
world. It is very important to develop methodologies, techniques, and tools for
developing secure software. To develop secure software, developers need to
think like an attacker, which can be aided by mining software repositories.
Mining software repositories aims to analyze and understand the data
repositories related to software development, with the main goal of supporting
the decision-making process of software development. There are different
vulnerability databases like Common Weakness Enumeration (CWE), Common
Vulnerabilities and Exposures database (CVE), and CAPEC. We utilized the MITRE
ATT&CK database. MITRE ATT&CK tactics and techniques have been used in various
ways and methods, but tools for utilizing these tactics and techniques in the
early stages of the software development life cycle (SDLC) are lacking. In this
paper, we use machine learning algorithms to map requirements to the MITRE
ATT&CK database and determine the accuracy of each mapping depending on the
data split.
( 2
min )
Studies have shown that large pretrained language models exhibit biases
against social groups based on race, gender, etc., which they inherit from the
datasets they are trained on. Various researchers have proposed mathematical
tools for quantifying and identifying these biases, and methods have been
proposed to mitigate them. In this paper, we present a comprehensive
quantitative evaluation of different kinds of biases, such as race, gender,
ethnicity, and age, exhibited by popular pretrained language models including
BERT and GPT-2. We also present a toolkit that provides plug-and-play
interfaces connecting these bias-identification tools to large pretrained
language models and gives users the opportunity to test custom
models against these metrics. The toolkit also
allows users to debias existing and custom models using the debiasing
techniques proposed so far. The toolkit is available at
https://github.com/HrishikeshVish/Fairpy.
( 2
min )
Recent advances in instruction-following large language models (LLMs) have
led to dramatic improvements in a range of NLP tasks. Unfortunately, we find
that the same improved capabilities amplify the dual-use risks for malicious
purposes of these models. Dual-use is difficult to prevent as
instruction-following capabilities now enable standard attacks from computer
security. The capabilities of these instruction-following LLMs provide strong
economic incentives for dual-use by malicious actors. In particular, we show
that instruction-following LLMs can produce targeted malicious content,
including hate speech and scams, bypassing in-the-wild defenses implemented by
LLM API vendors. Our analysis shows that this content can be generated
economically and at a cost likely lower than with human effort alone. Together,
our findings suggest that LLMs will increasingly attract more sophisticated
adversaries and attacks, and addressing these attacks may require new
approaches to mitigations.
( 2
min )
Modern NLP systems exhibit a range of biases, which a growing literature on
model debiasing attempts to correct. However, current progress is hampered by
a plurality of definitions of bias, means of quantification, and an oftentimes
vague relation between debiasing algorithms and theoretical measures of bias. This
paper seeks to clarify the current situation and plot a course for meaningful
progress in fair learning, with two key contributions: (1) making clear
inter-relations among the current gamut of methods, and their relation to
fairness theory; and (2) addressing the practical problem of model selection,
which involves a trade-off between fairness and accuracy and has led to
systemic issues in fairness research. Putting them together, we make several
recommendations to help shape future work.
( 2
min )
The number of technology enthusiasts is increasing day by day
with the prevalence of technological products and easy access to the internet.
Similarly, the number of people working behind this rapid development is rising
tremendously, and computer programmers make up a large portion of those
tech-savvy people. Codeforces is an online programming and contest-hosting
platform used by many competitive programmers worldwide. It is regarded as one
of the most standardized platforms for practicing programming problems and
participating in programming contests. In this research, we propose a framework
that predicts the performance of any particular contestant in upcoming
competitions as well as their rating after a contest, based on their
practice and their performance in previous contests.
( 2
min )
We present a novel momentum-based first order optimization method (AGNES)
which provably achieves acceleration for convex minimization, even if the
stochastic noise in the gradient estimates is many orders of magnitude larger
than the gradient itself. Here we model the noise as having a variance which is
proportional to the magnitude of the underlying gradient. We argue, based upon
empirical evidence, that this is appropriate for mini-batch gradients in
overparameterized deep learning. Furthermore, we demonstrate that the method
achieves competitive performance in the training of CNNs on MNIST and CIFAR-10.
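AGNES's exact update rule is given in the paper; as background, the classical Nesterov accelerated gradient scheme that momentum methods of this kind build on can be sketched as follows (illustrative only, with made-up hyperparameters, not the AGNES update itself):

```python
def nesterov(grad, x0, lr=0.1, momentum=0.9, steps=100):
    """Nesterov accelerated gradient on a 1-D objective: the gradient is
    evaluated at a lookahead point x + momentum * v rather than at x."""
    x, v = x0, 0.0
    for _ in range(steps):
        v = momentum * v - lr * grad(x + momentum * v)
        x += v
    return x

# Minimize f(x) = (x - 3)^2 with gradient 2 * (x - 3); the minimizer is 3.
x_star = nesterov(lambda x: 2 * (x - 3), x0=0.0)
```

The lookahead evaluation is what distinguishes Nesterov's scheme from plain heavy-ball momentum and yields the accelerated rate on convex problems.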
( 2
min )
We consider the sequential decision-making problem where the mean outcome is
a non-linear function of the chosen action. Compared with the linear model, two
curious phenomena arise in non-linear models: first, in addition to the
"learning phase" with a standard parametric rate for estimation or regret,
there is a "burn-in period" with a fixed cost determined by the non-linear
function; second, achieving the smallest burn-in cost requires new exploration
algorithms. For a special family of non-linear functions named ridge functions
in the literature, we derive upper and lower bounds on the optimal burn-in
cost, and in addition, on the entire learning trajectory during the burn-in
period via differential equations. In particular, a two-stage algorithm that
first finds a good initial action and then treats the problem as locally linear
is statistically optimal. In contrast, several classical algorithms, such as
UCB and algorithms relying on regression oracles, are provably suboptimal.
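For reference, the UCB algorithm mentioned above can be sketched in its classical multi-armed-bandit form (a hypothetical Bernoulli setup; the paper's ridge-function setting is more general):

```python
import math
import random

def ucb1(arms, horizon, seed=0):
    """UCB1 for Bernoulli bandits: pull each arm once, then always pick the
    arm maximizing (empirical mean + sqrt(2 * ln t / pulls))."""
    rng = random.Random(seed)
    counts = [0] * len(arms)
    sums = [0.0] * len(arms)
    for t in range(1, horizon + 1):
        if t <= len(arms):
            a = t - 1  # initial round-robin so every count is positive
        else:
            a = max(range(len(arms)),
                    key=lambda i: sums[i] / counts[i]
                    + math.sqrt(2 * math.log(t) / counts[i]))
        reward = 1.0 if rng.random() < arms[a] else 0.0
        counts[a] += 1
        sums[a] += reward
    return counts

# Three arms with success probabilities 0.2, 0.5, 0.8.
pulls = ucb1([0.2, 0.5, 0.8], horizon=2000)
```

The confidence bonus shrinks as an arm is pulled more often, so the best arm ends up pulled most; the paper shows such index policies can nonetheless incur a suboptimal burn-in cost in non-linear models.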
( 2
min )
In this paper, we extend the Wiener-Ito chaos decomposition to the class of
diffusion processes, whose drift and diffusion coefficient are of linear
growth. By omitting the orthogonality in the chaos expansion, we are able to
show that every $p$-integrable functional, for $p \in [1,\infty)$, can be
represented as a sum of iterated integrals of the underlying process. Using a
truncated sum of this expansion and (possibly random) neural networks for the
integrands, whose parameters are learned in a machine learning setting, we show
that every financial derivative can be approximated arbitrarily well in the
$L^p$-sense. Since the hedging strategy of the approximating option can be
computed in closed form, we obtain an efficient algorithm that can replicate
any integrable financial derivative with short runtime.
( 2
min )
We establish a dataset of over $1.6\times10^4$ experimental images of
Bose--Einstein condensates containing solitonic excitations to enable machine
learning (ML) for many-body physics research. About $33~\%$ of this dataset has
manually assigned and carefully curated labels. The remainder is automatically
labeled using SolDet -- an implementation of a physics-informed ML data
analysis framework -- consisting of a convolutional-neural-network-based
classifier and object detector, as well as a statistically motivated physics-informed
classifier and a quality metric. This technical note constitutes the definitive
reference of the dataset, providing an opportunity for the data science
community to develop more sophisticated analysis tools, to further understand
nonlinear many-body physics, and even advance cold atom experiments.
( 2
min )
We provide a first finite-particle convergence rate for Stein variational
gradient descent (SVGD). Specifically, whenever the target distribution is
sub-Gaussian with a Lipschitz score, SVGD with n particles and an appropriate
step size sequence drives the kernel Stein discrepancy to zero at an order
1/sqrt(log log n) rate. We suspect that the dependence on n can be improved,
and we hope that our explicit, non-asymptotic proof strategy will serve as a
template for future refinements.
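As background, a single SVGD update for one-dimensional particles might look like the following sketch (RBF kernel with a fixed bandwidth; the paper's analysis covers the general kernelized dynamics):

```python
import math

def svgd_step(xs, score, eps=0.05, h=1.0):
    """One SVGD update for 1-D particles with RBF kernel
    k(x, y) = exp(-(x - y)^2 / (2 * h)): each particle follows the
    kernel-smoothed score plus a repulsive kernel-gradient term."""
    n = len(xs)
    updated = []
    for i in range(n):
        phi = 0.0
        for j in range(n):
            k = math.exp(-(xs[j] - xs[i]) ** 2 / (2 * h))
            grad_k = k * (xs[i] - xs[j]) / h  # d k(x_j, x_i) / d x_j
            phi += k * score(xs[j]) + grad_k
        updated.append(xs[i] + eps * phi / n)
    return updated

# Target: standard normal, whose score function is -x.
xs = [-3.0, -2.0, 2.0, 3.0]
for _ in range(200):
    xs = svgd_step(xs, lambda x: -x)
```

The attractive term drives particles toward high-density regions while the kernel-gradient term keeps them spread out; the convergence rate above quantifies how fast the resulting empirical measure approaches the target in kernel Stein discrepancy.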
( 2
min )
We consider the problem of learning multioutput function classes in batch and
online settings. In both settings, we show that a multioutput function class is
learnable if and only if each single-output restriction of the function class
is learnable. This provides a complete characterization of the learnability of
multilabel classification and multioutput regression in both batch and online
settings. As an extension, we also consider multilabel learnability in the
bandit feedback setting and show a similar characterization as in the
full-feedback setting.
( 2
min )
Despite all the benefits of automated hyperparameter optimization (HPO), most
modern HPO algorithms are black-boxes themselves. This makes it difficult to
understand the decision process which leads to the selected configuration,
reduces trust in HPO, and thus hinders its broad adoption. Here, we study the
combination of HPO with interpretable machine learning (IML) methods such as
partial dependence plots. These techniques are increasingly used to explain
the marginal effect of hyperparameters on the black-box cost function or to
quantify the importance of hyperparameters. However, if such methods are
naively applied to the experimental data of the HPO process in a post-hoc
manner, the underlying sampling bias of the optimizer can distort
interpretations. We propose a modified HPO method which efficiently balances
the search for the global optimum w.r.t. predictive performance \emph{and} the
reliable estimation of IML explanations of an underlying black-box function by
coupling Bayesian optimization and Bayesian Algorithm Execution. On benchmark
cases of both synthetic objectives and HPO of a neural network, we demonstrate
that our method returns more reliable explanations of the underlying black-box
without a loss of optimization performance.
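As a reminder of the IML primitive involved, a one-dimensional partial dependence function is computed by averaging the black-box over the data with one input clamped to each grid value (a minimal sketch with a toy cost surface; the paper's contribution is coupling such estimates with the optimizer's sampling):

```python
def partial_dependence(f, X, feature, grid):
    """1-D partial dependence of a black-box f: for each grid value v,
    average f over the dataset with the chosen feature clamped to v."""
    pd = []
    for v in grid:
        total = 0.0
        for row in X:
            clamped = list(row)
            clamped[feature] = v
            total += f(clamped)
        pd.append(total / len(X))
    return pd

# Toy cost surface: f = x0^2 + x1, so feature 1 has a linear marginal effect.
X = [[0.0, 0.0], [1.0, 1.0], [2.0, -1.0]]
pd = partial_dependence(lambda r: r[0] ** 2 + r[1],
                        X, feature=1, grid=[0.0, 1.0, 2.0])
```

When `X` comes from an optimizer's trajectory rather than an unbiased design, this average inherits the optimizer's sampling bias, which is exactly the distortion the abstract warns about.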
( 2
min )
Graph Neural Networks (GNNs) have achieved much success on graph-structured
data. In light of this, there has been increasing interest in studying their
expressive power. One line of work studies the capability of GNNs to
approximate permutation-invariant functions on graphs, and another focuses on
their power as tests for graph isomorphism. Our work connects these two
perspectives and proves their equivalence. We further develop a framework of
the expressive power of GNNs that incorporates both of these viewpoints using
the language of sigma-algebra, through which we compare the expressive power of
different types of GNNs together with other graph isomorphism tests. In
particular, we prove that the second-order Invariant Graph Network fails to
distinguish non-isomorphic regular graphs with the same degree. Then, we extend
it to a new architecture, Ring-GNN, which succeeds in distinguishing these
graphs and achieves good performance on real-world datasets.
( 2
min )
We present a continuous-time probabilistic approach for estimating the chirp
signal and its instantaneous frequency function when the true forms of these
functions are not accessible. Our model represents these functions by
non-linearly cascaded Gaussian processes represented as non-linear stochastic
differential equations. The posterior distribution of the functions is then
estimated with stochastic filters and smoothers. We compute a (posterior)
Cram\'er--Rao lower bound for the Gaussian process model, and derive a
theoretical upper bound for the estimation error in the mean squared sense. The
experiments show that the proposed method outperforms a number of
state-of-the-art methods on synthetic data. We also show that the method
works out-of-the-box for two real-world datasets.
( 2
min )
In addition to the weights of shared synaptic connections, PNN includes
weights for synaptic effective ranges [14-24]. PNN considers synaptic strength
balance, both dynamically through the phagocytosis of synapses and statically
through a constant total synapse length [14], and includes the leading behavior
of a school of fish. Synapse formation inhibits dendrite generation to a
certain extent in both experiments and PNN simulations [15]. The memory
persistence gradient of the retrograde circuit is similar to enforcing
resilience in a Spring Boot application. The relatively good and inferior
gradient information stored in memory engram cells during synapse formation of
the retrograde circuit resembles the folds of the brain [16]. Regarding the
controversy over whether human hippocampal neurogenesis persists throughout
aging, PNN suggests it may form a new and longer circuit in late iterations
[17,18]. Closing the critical period causes neurological disorders in both
experiments and PNN simulations [19]. Considering the persistence of both
negative and positive memories activates synapse length changes across
iterations better than considering positive memories alone [20]. Astrocytic
phagocytosis avoids the local accumulation of synapses in simulation; a lack of
astrocytic phagocytosis causes excitatory and functionally impaired synapses to
accumulate in experiments, leading to the destruction of cognition, and to
locally longer synapses and worse results in PNN simulations [21]. This
reflects the relationship between intelligence and cortical thickness, and
individual differences in the brain [22]. PNN also considers the memory engram
cells that strengthen synaptic strength [23]. The effects of PNN's memory
structure and tPBM may be the same, owing to the strong penetrability of
signals [24]. Memory persistence also inhibits local synaptic accumulation.
PNN may thus introduce relatively good and inferior solutions into PSO. The
simple PNN has only synaptic phagocytosis.
( 3
min )
Recently, \cite{montasser2019vc} showed that finite VC dimension is not
sufficient for \textit{proper} adversarially robust PAC learning. In light of
this hardness result, there is a growing effort to study what type of
relaxations to the adversarially robust PAC learning setup can enable proper
learnability. In this work, we initiate the study of proper learning under
relaxations of the worst-case robust loss. We give a family of robust loss
relaxations under which VC classes are properly PAC learnable with sample
complexity close to what one would require in the standard PAC learning setup.
On the other hand, we show that for an existing and natural relaxation of the
worst-case robust loss, finite VC dimension is not sufficient for proper
learning. Lastly, we give new generalization guarantees for the adversarially
robust empirical risk minimizer.
( 2
min )
A formal write-up of the simple proof (1995) of the existence of calibrated
forecasts by the minimax theorem, which moreover shows that $N^3$ periods
suffice to guarantee a calibration error of at most $1/N$.
( 2
min )
The limit of infinite width allows for substantial simplifications in the
analytical study of over-parameterised neural networks. With a suitable random
initialisation, an extremely large network exhibits an approximately Gaussian
behaviour. In the present work, we establish a similar result for a simple
stochastic architecture whose parameters are random variables, holding both
before and during training. The explicit evaluation of the output distribution
allows for a PAC-Bayesian training procedure that directly optimises the
generalisation bound. For a large but finite-width network, we show empirically
on MNIST that this training approach can outperform standard PAC-Bayesian
methods.
( 2
min )
Recently there has been rising interest in the study of mean field
optimization, in particular because of its role in analyzing the training of
neural networks. In this paper by adding the Fisher Information as the
regularizer, we relate the regularized mean field optimization problem to a
so-called mean field Schrodinger dynamics. We develop an energy-dissipation
method to show that the marginal distributions of the mean field Schrodinger
dynamics converge exponentially quickly towards the unique minimizer of the
regularized optimization problem. Remarkably, the mean field Schrodinger
dynamics is proved to be a gradient flow on the probability measure space with
respect to the relative entropy. Finally we propose a Monte Carlo method to
sample the marginal distributions of the mean field Schrodinger dynamics.
( 2
min )
We consider the task of representing signals supported on graph bundles,
which are generalizations of product graphs that allow for "twists" in the
product structure. Leveraging the localized product structure of a graph
bundle, we demonstrate how a suitable partition of unity over the base graph
can be used to lift the signal on the graph into a space where a product
factorization can be readily applied. Motivated by the locality of this
procedure, we demonstrate that bases for the signal spaces of the components of
the graph bundle can be lifted in the same way, yielding a basis for the signal
space of the total graph. We demonstrate this construction on synthetic graphs,
as well as with an analysis of the energy landscape of conformational manifolds
in stereochemistry.
( 2
min )
This manuscript investigates the one-pass stochastic gradient descent (SGD)
dynamics of a two-layer neural network trained on Gaussian data and labels
generated by a similar, though not necessarily identical, target function. We
rigorously analyse the limiting dynamics via a deterministic and
low-dimensional description in terms of the sufficient statistics for the
population risk. Our unifying analysis bridges different regimes of interest,
such as the classical gradient-flow regime of vanishing learning rate, the
high-dimensional regime of large input dimension, and the overparameterised
"mean-field" regime of large network width, covering as well the intermediate
regimes where the limiting dynamics is determined by the interplay between
these behaviours. In particular, in the high-dimensional limit, the
infinite-width dynamics is found to remain close to a low-dimensional subspace
spanned by the target principal directions. Our results therefore provide a
unifying picture of the limiting SGD dynamics with synthetic data.
( 2
min )
The Distributional Random Forest (DRF) is a recently introduced Random Forest
algorithm to estimate multivariate conditional distributions. Due to its
general estimation procedure, it can be employed to estimate a wide range of
targets such as conditional average treatment effects, conditional quantiles,
and conditional correlations. However, only results about the consistency and
convergence rate of the DRF prediction are available so far. We characterize
the asymptotic distribution of DRF and develop a bootstrap approximation of it.
This allows us to derive inferential tools for quantifying standard errors and
the construction of confidence regions that have asymptotic coverage
guarantees. In simulation studies, we empirically validate the developed theory
for inference of low-dimensional targets and for testing distributional
differences between two populations.
( 2
min )
We prove a convergence theorem for U-statistics of degree two, where the data
dimension $d$ is allowed to scale with sample size $n$. We find that the
limiting distribution of a U-statistic undergoes a phase transition from the
non-degenerate Gaussian limit to the degenerate limit, regardless of its
degeneracy and depending only on a moment ratio. A surprising consequence is
that a non-degenerate U-statistic in high dimensions can have a non-Gaussian
limit with a larger variance and asymmetric distribution. Our bounds are valid
for any finite $n$ and $d$, independent of individual eigenvalues of the
underlying function, and dimension-independent under a mild assumption. As an
application, we apply our theory to two popular kernel-based distribution
tests, MMD and KSD, whose high-dimensional performance has been challenging to
study. In a simple empirical setting, our results correctly predict how the
test power at a fixed threshold scales with $d$ and the bandwidth.
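For concreteness, a degree-two U-statistic averages a symmetric kernel over all unordered pairs of observations; a minimal sketch (using the variance kernel as an example, not the MMD or KSD kernels studied in the paper):

```python
from itertools import combinations

def u_statistic(xs, h):
    """Degree-two U-statistic: the average of a symmetric kernel h over
    all unordered pairs of observations."""
    pairs = list(combinations(xs, 2))
    return sum(h(x, y) for x, y in pairs) / len(pairs)

# With h(x, y) = (x - y)^2 / 2 this is the unbiased sample variance.
data = [1.0, 2.0, 3.0, 4.0]
var_u = u_statistic(data, lambda x, y: (x - y) ** 2 / 2)
```

MMD and KSD test statistics have the same pairwise-average form, with `h` built from a kernel evaluated between samples, which is why the phase-transition result above applies to them.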
( 2
min )
Electronic health records (EHR) often contain sensitive medical information
about individual patients, posing significant limitations to sharing or
releasing EHR data for downstream learning and inferential tasks. We use
normalizing flows (NF), a family of deep generative models, to estimate the
probability density of a dataset with differential privacy (DP) guarantees,
from which privacy-preserving synthetic data are generated. We apply the
technique to an EHR dataset containing patients with pulmonary hypertension. We
assess the learning and inferential utility of the synthetic data by comparing
the accuracy in the prediction of the hypertension status and variational
posterior distribution of the parameters of a physics-based model. In addition,
we use a simulated dataset from a nonlinear model to compare the results from
variational inference (VI) based on privacy-preserving synthetic data, and
privacy-preserving VI obtained from directly privatizing NFs for VI with DP
guarantees given the original non-private dataset. The results suggest that
synthetic data generated through differentially private density estimation with
NF can yield good utility at a reasonable privacy cost. We also show that VI
obtained from differentially private NF based on the free energy bound loss may
produce variational approximations with significantly altered correlation
structure, and loss formulations based on alternative dissimilarity metrics
between two distributions might provide improved results.
( 2
min )
In a recent paper, Ling et al. investigated the over-parametrized Deep
Equilibrium Model (DEQ) with ReLU activation and proved that gradient
descent converges to a globally optimal solution at a linear convergence rate
for the quadratic loss function. In this paper, we show that this fact still
holds for DEQs with any general activation which has bounded first and second
derivatives. Since the new activation function is generally non-linear, a
general population Gram matrix is designed, and a new form of dual activation
with Hermite polynomial expansion is developed.
( 2
min )
We propose a new bound for generalization of neural networks using Koopman
operators. Unlike most of the existing works, we focus on the role of the final
nonlinear transformation of the networks. Our bound is described by the
reciprocal of the determinant of the weight matrices and is tighter than
existing norm-based bounds when the weight matrices do not have small singular
values. According to existing theories about the low-rankness of the weight
matrices, it may be counter-intuitive that we focus on the case where singular
values of weight matrices are not small. However, motivated by the final
nonlinear transformation, we can see that our result sheds light on a new
perspective regarding a noise filtering property of neural networks. Since our
bound comes from Koopman operators, this work also provides a connection
between operator-theoretic analysis and generalization of neural networks.
Numerical results support the validity of our theoretical results.
( 2
min )
Here is a podcast episode with Noam Brown from Meta AI where we discuss his work on achieving human-level performance on poker and Diplomacy, as well as the power of spending compute at inference time!
submitted by /u/thejashGI
( 42
min )
I'm glad to share with you our Open Access survey paper about image super-resolution:
https://ieeexplore.ieee.org/abstract/document/10041995
The goal of this work is to give an overview of the abundance of publications in image super-resolution, give an introduction for new researchers, and open thriving discussions as well as point to potential future directions to advance the field :)
submitted by /u/Maleficent_Stay_7737
( 43
min )
Amazon Kendra is an intelligent search service powered by machine learning (ML). It indexes the documents stored in a wide range of repositories and finds the most relevant document based on the keywords or natural language questions the user has searched for. In some scenarios, you need the search results to be filtered based on […]
( 12
min )
We’re excited to announce that Amazon Personalize now lets you measure how your personalized recommendations can help you achieve your business goals. After specifying the metrics that you want to track, you can identify which campaigns and recommenders are most impactful and understand the impact of recommendations on your business metrics. All customers want to […]
( 10
min )
Love and creativity are in the air this Valentine’s Day in the NVIDIA Studio, as 3D artist Molly Brady presents The Birth of Venus (Redux), a parody scene inspired by the iconic painting by Sandro Botticelli.
( 7
min )
How a single SYCL codebase makes it possible to run on multiple devices such as Intel GPUs, AMD GPUs, and NVIDIA GPUs Posted on behalf of Arti Gupta, Intel oneAPI Program Director The ever-growing scale and speed of High-Performance Computing (HPC) systems unleash many new opportunities for researchers and data scientists. Today, the first exascale-capable HPC systems,… Read More »Advancing HPC and AI through oneAPI Heterogeneous Programming in Academia and Research
The post Advancing HPC and AI through oneAPI Heterogeneous Programming in Academia and Research appeared first on Data Science Central.
( 20
min )
The world is going digital at a very fast speed. From retail shops to the cab industry to banking, all are changing, and so is the healthcare industry. We can see a huge difference in the industry in terms of technology compared to ten years ago. But there is a long way to go for… Read More »Top Healthcare App Development Trends That Will Dominate 2023
( 22
min )
There’s no denying that we live in an app-driven world, and that’s especially true for modern businesses. Organizations use apps for almost everything. While this allows for faster communication, it can also lead to application fragmentation. App fragmentation is when an organization uses multiple applications to perform similar tasks. This creates an inefficient and disjointed… Read More »App Fragmentation & How To Avoid Siloed Communication: 3 Right Technologies for The Job
( 22
min )
So I just uploaded a devlog about my bullet-dodging AI game. I discuss how I trained a Reinforcement Learning agent to learn to dodge bullets using Unity's ML Agents package! The goal of the next devlog is to extend this to a 2 player setting, where a human player competes against a trained AI player to dodge/shoot bullets! I will probably be doing some MARL with self-play to achieve this, but this video is a single-agent setting.
I'm a baby Youtuber, so I appreciate yall for checking it out!
https://youtu.be/l9geEcn-A6Q
submitted by /u/AvvYaa
( 41
min )
This post is co-written by Zdenko Estok, Cloud Architect at Accenture and Sakar Selimcan, DeepRacer SME at Accenture. With the increasing use of artificial intelligence (AI) and machine learning (ML) for a vast majority of industries (ranging from healthcare to insurance, from manufacturing to marketing), the primary focus shifts to efficiency when building and training […]
( 8
min )
The method enables a model to determine its confidence in a prediction, while using no additional data and far fewer computing resources than other methods.
( 9
min )
Using available off-the-shelf AI services, I ended up making this video. I walk through the process and discuss some implications.
Here is the process that I followed
1. Asked ChatGPT to create a script
2. Asked a text-to-speech generative AI to convert the script into audio
3. Asked MidJourney to create an avatar of a narrator
4. Asked an audio-to-video generative AI to generate a video from the avatar and audio.
https://ithinkbot.com/make-end-to-end-video-using-generative-ai-totally-free-try-it-out-dadee18302de
submitted by /u/Opitmus_Prime
( 41
min )
Hi guys,
I have made a video on YouTube here where I explain how we can measure the fairness of a machine learning model by using the disparate impact score.
I hope it may be of use to some of you out there. As always, feedback is more than welcome! :)
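For readers unfamiliar with the metric, the disparate impact score is typically the ratio of positive-prediction rates between an unprivileged and a privileged group; a minimal sketch with made-up data:

```python
def disparate_impact(y_pred, group):
    """Disparate impact: P(positive | unprivileged) / P(positive | privileged).
    The 'four-fifths rule' of thumb flags values below 0.8 as unfair."""
    priv = [y for y, g in zip(y_pred, group) if g == 1]
    unpriv = [y for y, g in zip(y_pred, group) if g == 0]
    return (sum(unpriv) / len(unpriv)) / (sum(priv) / len(priv))

# Made-up predictions: 1 = positive outcome; group 1 = privileged, 0 = unprivileged.
preds = [1, 1, 1, 0, 1, 0, 0, 0]
groups = [1, 1, 1, 1, 0, 0, 0, 0]
di = disparate_impact(preds, groups)  # (1/4) / (3/4) = 1/3
```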
submitted by /u/Personal-Trainer-541
( 41
min )
Hi, I have just come across an AI course provided by OpenCV. It covers a lot of computer vision material. But it costs $1599. Is anyone taking it? Any comments? Should I bet on this for a career change?
P.S. I have some basic programming knowledge and engineering background.
Here is the link to their course page.
https://opencv.org/courses/
submitted by /u/sumofjack
( 41
min )
If you want any more proof of how much AI has integrated itself into our daily lives, look no further than the map on your smartphone. Whether you use Google Maps or Apple Maps or Waze (also owned by Google), these AI-infused apps are amazing at getting you from Point A to Point B… Read More »AI Effectiveness Starts by Understanding User Intent
( 22
min )
Incident management for cloud services is a complex process involving several
steps, and it has a huge impact on both service health and developer
productivity. On-call engineers require a significant amount of domain
knowledge and manual effort for root-causing and mitigating production
incidents. Recent advances in artificial intelligence have resulted in
state-of-the-art large language models like GPT-3.x (both GPT-3.0 and GPT-3.5),
which have been used to solve a variety of problems ranging from question
answering to text summarization. In this work, we conduct the first large-scale
study to evaluate the effectiveness of these models for helping engineers root-cause
and mitigate production incidents. We perform a rigorous study at
Microsoft on more than 40,000 incidents and compare several large language
models in zero-shot, fine-tuned, and multi-task settings using semantic and
lexical metrics. Lastly, our human evaluation with actual incident owners shows
the efficacy and future potential of using artificial intelligence for
resolving cloud incidents.
( 2
min )
....Instead, it's introducing a new way for people to access the same information. One which can put a major dent in its market share (it’s almost 85% right now).
And Satya says he's willing to accept a "decrease in margins" of the Search business.
https://www.thestatuscode.co/p/the-ultimate-guide-to-the-ai-war
submitted by /u/pyactee
( 41
min )
I really want to play with the repo but I'm stuck at the last step of the instructions (https://github.com/lucidrains/musiclm-pytorch#usage-1). If anyone has tips, please let me know!
Here's the issue I have: https://github.com/lucidrains/musiclm-pytorch/issues/13
submitted by /u/BackgroundPass2082
( 42
min )
MIT spinout Verta offers tools to help companies introduce, monitor, and manage machine-learning models safely and at scale.
( 10
min )
This post is co-written with Jonathan Jung, Mike Band, Michael Chi, and Thompson Bliss at the National Football League. A coverage scheme refers to the rules and responsibilities of each football defender tasked with stopping an offensive pass. It is at the core of understanding and analyzing any football defensive strategy. Classifying the coverage scheme […]
( 14
min )
The metaverse, a term popularised by science fiction, refers to a shared virtual space where users can interact with each other in a virtual environment. It's a convergence of real and virtual worlds, creating a new reality that exists simultaneously with the physical world. With the rapid advancement of technology, particularly in the field of…
The post Metaverse Development: Building the Future of Virtual Reality appeared first on Data Science Central.
( 20
min )
The following guide provides an independent review of how well this OpenAI detection software performs and how its capabilities stack up against competitors (for finding AI-generated text and plagiarism): OpenAI Text Classifier: ChatGPT's Own AI Detection - Review
submitted by /u/thumbsdrivesmecrazy
( 41
min )
Amazon Textract is a machine learning (ML) service that automatically extracts text, handwriting, and data from any document or image. AnalyzeDocument Signatures is a feature within Amazon Textract that offers the ability to automatically detect signatures on any document. This can reduce the need for human review, custom code, or ML experience. In this post, […]
( 7
min )
Earth’s changing climate poses an increased risk of drought due to global warming. Since 1880, the global temperature has increased 1.01 °C. Since 1993, sea levels have risen 102.5 millimeters. Since 2002, the land ice sheets in Antarctica have been losing mass at a rate of 151.0 billion metric tons per year. In 2022, the […]
( 10
min )
The chatbot’s success on the medical licensing exam shows that the test — and medical education — are flawed, Celi says.
( 8
min )
I'd like to hear what you guys think about this approach.
submitted by /u/ThePerson654321
( 43
min )
Electric automaker XPENG's flagship G9 SUV and P7 sports sedan are now available for order in Sweden, Denmark, Norway and the Netherlands — an expansion revealed last week at the eCar Expo in Stockholm. The intelligent electric vehicles are built on the high-performance NVIDIA DRIVE Orin centralized compute architecture and deliver AI capabilities that are…
( 5
min )
Designing automotive visualizations can be incredibly time consuming. To make the renders look as realistic as possible, artists need to consider material textures, paints, realistic lighting and reflections, and more. For 3D artist David Baylis, it's important to include these details and still create high-resolution renders in a short amount of time. That's why he…
( 6
min )
Venture to the Forgotten Realms this GFN Thursday in Baldur's Gate 3, streaming on GeForce NOW. Celebrations for the cloud gaming service's third anniversary continue with a Dying Light 2 reward that's to die for. It's the cherry on top of three new titles joining the GeForce NOW library this week.
( 5
min )
Machine learning (ML) has become ubiquitous. Our customers are employing ML in every aspect of their business, including the products and services they build, and for drawing insights about their customers. To build an ML-based application, you have to first build the ML model that serves your business requirement. Building ML models involves preparing the […]
( 15
min )
The first NVIDIA Studio laptops powered by GeForce RTX 40 Series Laptop GPUs are now available, starting with systems from MSI and Razer — with many more to come.
( 8
min )
Critical applications, such as in the medical field, require the rapid
provision of additional information to interpret decisions made by deep
learning methods. In this work, we propose a fast and accurate method to
visualize activations of classification and semantic segmentation networks by
stitching them with a GAN generator utilizing convolutions. We test our
approach on images of animals from the AFHQ wild dataset and real-world digital
pathology scans of stained tissue samples. Our method provides comparable
results to established gradient descent methods on these datasets while running
about two orders of magnitude faster.
( 2
min )
We study online Reinforcement Learning (RL) in non-stationary input-driven
environments, where a time-varying exogenous input process affects the
environment dynamics. Online RL is challenging in such environments due to
catastrophic forgetting (CF). The agent tends to forget prior knowledge as it
trains on new experiences. Prior approaches to mitigate this issue assume task
labels (which are often not available in practice) or use off-policy methods
that can suffer from instability and poor performance.
We present Locally Constrained Policy Optimization (LCPO), an on-policy RL
approach that combats CF by anchoring policy outputs on old experiences while
optimizing the return on current experiences. To perform this anchoring, LCPO
locally constrains policy optimization using samples from experiences that lie
outside of the current input distribution. We evaluate LCPO in two gym and
computer systems environments with a variety of synthetic and real input
traces, and find that it outperforms state-of-the-art on-policy and off-policy
RL methods in the online setting, while achieving results on-par with an
offline agent pre-trained on the whole input trace.
( 2
min )
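The anchoring step can be sketched as a penalized objective. Here the anchor term is assumed to be a KL divergence between the old and current action distributions on out-of-distribution anchor states, which is one natural reading of "locally constrains policy optimization" rather than LCPO's exact formulation.

```python
import math

def kl_divergence(p, q):
    """KL divergence between two discrete action distributions."""
    return sum(pi * math.log(pi / qi) for pi, qi in zip(p, q) if pi > 0)

def lcpo_style_loss(current_loss, anchor_old_probs, anchor_new_probs, beta=1.0):
    """Sketch of the anchoring idea: optimize the return on current experiences
    while penalizing divergence from the old policy's outputs on anchor states
    drawn from outside the current input distribution."""
    penalty = sum(kl_divergence(p_old, p_new)
                  for p_old, p_new in zip(anchor_old_probs, anchor_new_probs))
    return current_loss + beta * penalty
```

When the current policy matches the old one on the anchor states, the penalty vanishes and the objective reduces to the ordinary on-policy loss.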
Bilevel optimization has been developed for many machine learning tasks with
large-scale and high-dimensional data. This paper considers a constrained
bilevel optimization problem, where the lower-level optimization problem is
convex with equality and inequality constraints and the upper-level
optimization problem is non-convex. The overall objective function is
non-convex and non-differentiable. To solve the problem, we develop a
gradient-based approach, called gradient approximation method, which determines
the descent direction by computing several representative gradients of the
objective function inside a neighborhood of the current estimate. We show that
the algorithm asymptotically converges to the set of Clarke stationary points,
and demonstrate the efficacy of the algorithm by the experiments on
hyperparameter optimization and meta-learning.
( 2
min )
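A minimal sketch of the descent-direction idea, assuming the "representative gradients" are finite-difference gradients evaluated at random points in a small neighborhood of the current estimate (the paper's actual sampling scheme may differ):

```python
import random

def numerical_grad(f, x, h=1e-5):
    """Central finite-difference gradient of f at point x."""
    g = []
    for i in range(len(x)):
        xp, xm = list(x), list(x)
        xp[i] += h
        xm[i] -= h
        g.append((f(xp) - f(xm)) / (2 * h))
    return g

def approx_descent_direction(f, x, radius=0.1, n_samples=5, seed=0):
    """Average several gradients sampled inside a neighborhood of x and return
    the negated average as the descent direction."""
    rng = random.Random(seed)
    avg = [0.0] * len(x)
    for _ in range(n_samples):
        y = [xi + rng.uniform(-radius, radius) for xi in x]
        g = numerical_grad(f, y)
        avg = [a + gi / n_samples for a, gi in zip(avg, g)]
    return [-a for a in avg]
```

On a nonsmooth objective such as f(x) = |x1| + |x2|, averaging gradients over a neighborhood yields a usable descent direction even near points where the gradient itself is undefined.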
Contrary to its original interpretation as a facilitator of knowledge
transfer from one model to another, some recent studies have suggested that
knowledge distillation (KD) is instead a form of regularization. Perhaps the
strongest support of all for this claim is found in its apparent similarities
with label smoothing (LS). This paper investigates the stated equivalence of
these two methods by examining the predictive uncertainties of the models they
train. Experiments on four text classification tasks involving teachers and
students of different capacities show that: (a) In most settings, KD and LS
drive model uncertainty (entropy) in completely opposite directions, and (b) In
KD, the student's predictive uncertainty is a direct function of that of its
teacher, reinforcing the knowledge transfer view.
( 2
min )
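The opposite pressures on uncertainty are visible directly in the targets each method trains against; a minimal sketch (the temperature and smoothing values are illustrative):

```python
import math

def entropy(p):
    """Shannon entropy of a discrete distribution."""
    return -sum(pi * math.log(pi) for pi in p if pi > 0)

def ls_target(onehot, eps=0.1):
    """Label smoothing: mix the one-hot label with the uniform distribution."""
    k = len(onehot)
    return [(1 - eps) * pi + eps / k for pi in onehot]

def kd_target(teacher_logits, temperature=2.0):
    """Knowledge distillation: soften the teacher's logits, so the student's
    target uncertainty is inherited from the teacher."""
    exps = [math.exp(z / temperature) for z in teacher_logits]
    s = sum(exps)
    return [e / s for e in exps]
```

Label smoothing always mixes toward the uniform distribution, so its target entropy is fixed by eps, whereas the distillation target's entropy is whatever the teacher's softened distribution happens to be, consistent with the knowledge transfer view described above.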
This work investigates the intersection of cross-modal learning and
semi-supervised learning, where we aim to improve the supervised learning
performance of the primary modality by borrowing missing information from an
unlabeled modality. We investigate this problem from a Nadaraya-Watson (NW)
kernel regression perspective and show that this formulation implicitly leads
to a kernelized cross-attention module. To this end, we propose The Attention
Patch (TAP), a simple neural network plugin that allows data-level knowledge
transfer from the unlabeled modality. We provide numerical simulations on three
real-world datasets to examine each aspect of TAP and show that TAP integration
in a neural network can improve generalization performance using the unlabeled
modality.
( 2
min )
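The NW estimator at the heart of this view is a kernel-weighted average of values, i.e., attention with a Gaussian kernel in place of dot-product scores. A minimal sketch with scalar outputs (assumed names; the actual TAP module is a learned neural component):

```python
import math

def nw_regress(query, keys, values, bandwidth=1.0):
    """Nadaraya-Watson kernel regression: a Gaussian-kernel weighted average
    of values, which reads exactly like a cross-attention layer where keys
    come from the other modality."""
    weights = [math.exp(-sum((q - k) ** 2 for q, k in zip(query, key))
                        / (2 * bandwidth ** 2))
               for key in keys]
    z = sum(weights)
    return sum(w * v for w, v in zip(weights, values)) / z
```

A query sitting on top of a key recovers that key's value; a query equidistant between two keys averages them.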
There has been much recent progress in forecasting the next observation of a
linear dynamical system (LDS), known as improper learning, as well as in the
estimation of its system matrices, known as proper learning of the LDS. We
present an approach to proper learning of the LDS which, in spite of the
non-convexity of the problem, guarantees global convergence of numerical
solutions to a least-squares estimator. We present promising computational
results.
( 2
min )
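For intuition, in the scalar case the least-squares estimator of the system matrix has a closed form; the sketch below handles x_{t+1} = a·x_t (the paper addresses the general, non-convex matrix case):

```python
def fit_lds_1d(xs):
    """Least-squares estimate of a in the scalar LDS x_{t+1} = a * x_t,
    given one observed trajectory: argmin_a sum_t (x_{t+1} - a x_t)^2."""
    num = sum(x0 * x1 for x0, x1 in zip(xs, xs[1:]))
    den = sum(x0 * x0 for x0 in xs[:-1])
    return num / den
```

Proper learning recovers the system matrix itself (here the scalar a), not merely a next-step forecast.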
We present a variety of novel information-theoretic generalization bounds for
learning algorithms, in the supersample setting of Steinke & Zakynthinou
(2020), i.e., the setting of the "conditional mutual information" framework.
Our development exploits projecting the loss pair (obtained from a training
instance and a testing instance) down to a single number and correlating loss
values with a Rademacher sequence (and its shifted variants). The presented
bounds include square-root bounds, fast-rate bounds (including those based on
variance and sharpness), and bounds for interpolating algorithms, among others.
We show, theoretically or empirically, that these bounds are tighter than all
information-theoretic bounds known to date in the same supersample setting.
( 2
min )
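For context, the canonical square-root bound of the conditional-mutual-information framework (stated informally, up to constants and notational conventions) is

$$ \mathbb{E}\left[ L(W) - \hat{L}_n(W) \right] \le \sqrt{\frac{2\, I(W; U \mid \tilde{Z})}{n}}, $$

where $\tilde{Z}$ is the supersample, $U$ is the random train/test selector, and $I(\cdot\,;\cdot \mid \cdot)$ denotes conditional mutual information; bounds of this type are what the abstract's results tighten.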
Message Passing Neural Networks (MPNNs) are instances of Graph Neural
Networks that leverage the graph to send messages over the edges. This
inductive bias leads to a phenomenon known as over-squashing, where a node
feature is insensitive to information contained at distant nodes. Despite
recent methods introduced to mitigate this issue, an understanding of the
causes of over-squashing and of possible solutions is lacking. In this
theoretical work, we prove that: (i) Neural network width can mitigate
over-squashing, but at the cost of making the whole network more sensitive;
(ii) Conversely, depth cannot help mitigate over-squashing: increasing the
number of layers leads to over-squashing being dominated by vanishing
gradients; (iii) The graph topology plays the greatest role, since
over-squashing occurs between nodes at high commute (access) time. Our analysis
provides a unified framework to study different recent methods introduced to
cope with over-squashing and serves as a justification for a class of methods
that fall under `graph rewiring'.
( 2
min )
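Commute time, the quantity in result (iii), can be estimated for a small graph by simulating random walks; a self-contained sketch (a Monte Carlo stand-in for the exact Laplacian-based computation):

```python
import random

def commute_time(adj, u, v, n_walks=2000, seed=0):
    """Estimate the commute time between nodes u and v of an undirected graph
    (expected steps for a random walk to go u -> v and back) by simulation.
    adj maps each node to its list of neighbors."""
    rng = random.Random(seed)

    def hitting(a, b):
        total = 0
        for _ in range(n_walks):
            x, steps = a, 0
            while x != b:
                x = rng.choice(adj[x])
                steps += 1
            total += steps
        return total / n_walks

    return hitting(u, v) + hitting(v, u)
```

On a path graph, the end-to-end pair has a much higher commute time than adjacent nodes, which is exactly where the analysis predicts over-squashing to be most severe.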
This work studies the pure-exploration setting for the convex hull
feasibility (CHF) problem where one aims to efficiently and accurately
determine if a given point lies in the convex hull of means of a finite set of
distributions. We give a complete characterization of the sample complexity of
the CHF problem in the one-dimensional setting. We present the first
asymptotically optimal algorithm called Thompson-CHF, whose modular design
consists of a stopping rule and a sampling rule. In addition, we provide an
extension of the algorithm that generalizes several important problems in the
multi-armed bandit literature. Finally, we further investigate the Gaussian
bandit case with unknown variances and address how the Thompson-CHF algorithm
can be adjusted to be asymptotically optimal in this setting.
( 2
min )
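In one dimension the feasibility question itself is elementary: the point lies in the convex hull of the means iff it is between the smallest and largest mean. A plug-in sketch of that check (Thompson-CHF adds the sampling and stopping rules that make the decision sample-efficient with noisy, adaptively collected data):

```python
import statistics

def in_hull_1d(point, samples_per_arm):
    """Plug-in 1-D convex hull feasibility check: is the point between the
    smallest and largest empirical mean of the distributions?"""
    means = [statistics.mean(s) for s in samples_per_arm]
    return min(means) <= point <= max(means)
```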
The recipe behind the success of deep learning has been the combination of
neural networks and gradient-based optimization. Understanding the behavior of
gradient descent, however, and particularly its instability, has lagged behind
its empirical success. To add to the theoretical tools available to study
gradient descent, we propose the principal flow (PF), a continuous time flow
that approximates gradient descent dynamics. To our knowledge, the PF is the
only continuous flow that captures the divergent and oscillatory behaviors of
gradient descent, including escaping local minima and saddle points. Through
its dependence on the eigendecomposition of the Hessian, the PF sheds light on
the recently observed edge-of-stability phenomena in deep learning. Using our
new understanding of instability, we propose a learning rate adaptation method
which enables us to control the trade-off between training stability and test
set evaluation performance.
( 2
min )
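The instability that motivates the PF is already visible on a one-dimensional quadratic, where gradient descent diverges exactly when the learning rate exceeds 2 divided by the curvature; a minimal sketch:

```python
def gd_diverges(curvature, lr, steps=50, x0=1.0):
    """On f(x) = c*x^2/2, each gradient step multiplies x by (1 - lr*c), so
    the iterates oscillate and grow exactly when lr > 2/c.  Returns True if
    the iterate has moved farther from the minimum than where it started."""
    x = x0
    for _ in range(steps):
        x -= lr * curvature * x
    return abs(x) > abs(x0)
```

This 2/c threshold is the simplest instance of the stability limit behind the edge-of-stability observations the abstract refers to.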
https://www.theverge.com/2023/2/7/23587454/microsoft-bing-edge-chatgpt-ai
submitted by /u/currentscurrents
( 44
min )
From Article:
Getty Images' new lawsuit claims that Stability AI, the company behind Stable Diffusion's AI image generator, stole 12 million Getty images with their captions, metadata, and copyrights "without permission" to "train its Stable Diffusion algorithm."
The company has asked the court to order Stability AI to remove violating images from its website and pay $150,000 for each.
However, it would be difficult to prove all the violations. Getty submitted over 7,000 images, along with metadata and copyright registrations, that were used by Stable Diffusion.
submitted by /u/vadhavaniyafaijan
( 49
min )
📢 News 📢
Pythae 0.1.0 is now out and supports distributed training using PyTorch DDP !
Train your favorite Variational Autoencoders (VAEs) faster 🏎️ and on larger datasets, still with a few lines of code 🖥️.
👉github: https://github.com/clementchadebec/benchmark_VAE
👉pypi: https://pypi.org/project/pythae/
submitted by /u/cchad-8
( 43
min )
Hey guys, I’m the co-founder of a tech startup focused on providing free AI services. We’re one of the first mobile multipurpose AI apps.
We’ve developed a pretty cool app that offers AI services like image generation, code generation, image captioning, and more for free. We’re sort of like a Swiss Army knife of generative and analytical AI.
We’ve released a new feature called AAIA (Ask AI Anything), which is capable of answering all types of questions, even requests to generate literature, story-lines, jokes, general information, etc.
We’d love to have some people try it out, give us feedback, and keep in touch with us.
https://apps.apple.com/us/app/bright-eye/id1593932475
submitted by /u/BrightEyeuser
( 41
min )
https://medium.com/seeds-for-the-future/the-next-step-for-generative-ai-830112890d04?sk=1d6b4c96cc6cb0a4690bcf9df0d12bcc
submitted by /u/arnolds112
( 40
min )
This post is co-written with Stephen Aylward, Matt McCormick, Brianna Major from Kitware and Justin Kirby from the Frederick National Laboratory for Cancer Research (FNLCR). Amazon SageMaker Studio Lab provides no-cost access to a machine learning (ML) development environment to everyone with an email address. Like the fully featured Amazon SageMaker Studio, Studio Lab allows […]
( 8
min )
Amazon SageMaker has announced the support of three new completion criteria for Amazon SageMaker automatic model tuning, providing you with an additional set of levers to control the stopping criteria of the tuning job when finding the best hyperparameter configuration for your model. In this post, we discuss these new completion criteria, when to use them, and […]
( 8
min )
AI Weirdness: the strange side of machine learning
( 2
min )
Announcements Machine Learning Controversy: From No-Code to No-Math One controversial topic in machine learning circles is code versus no-code. Can you be a real data scientist if you don't code? Of course you can: You may be leveraging platforms and the code is one or two layers below the responsibilities of your job. Maybe you…
The post DSC Weekly 7 February 2023 – Machine Learning Controversy: From No-Code to No-Math appeared first on Data Science Central.
( 21
min )
Data labeling and/or data annotation has long been a critical component of many machine learning and AI initiatives. In recent years, the demand for accurate and reliable data labeling has risen dramatically as the process becomes increasingly vital to the success of numerous projects. But what is data labeling exactly? Data Labeling 2023 – how…
The post The Impact of Data Labeling 2023: Current Trends & Future Demands appeared first on Data Science Central.
( 22
min )
Mobile Apps to Develop Your Data Science Skills - Mobile phones are the most preferred medium of accomplishing minute-to-minutest tasks on a daily basis. We don't need to visit any particular restaurant to take away the food, we can do this by just sitting on our favorite couch at home, thanks to food ordering apps. Not…
The post Best 9 Mobile Apps to Develop Your Data Science Skills in 2023 appeared first on Data Science Central.
( 23
min )
Doctors rarely make diagnoses based on a single factor — they look at a mix of data types, such as a patient's symptoms, laboratory and radiology reports, and medical history. VinBrain, a Vietnam-based health-tech startup, is ensuring that AI diagnostics can take a similarly holistic view across vital signs, blood tests, medical images and more.
( 6
min )
AI Seinfeld Transphobic rant - YouTube
submitted by /u/Status_Signal_4083
( 42
min )
I have made a Stack Overflow post here. I will highly appreciate all your help on this. Thank you!
submitted by /u/Academic-Rent7800
( 42
min )
It took me about 46 hours to run this on my 3080 at home. The original file was from the Blu-ray release, which was unfortunately pretty poorly done in my opinion. This version really gives it new life, I think.
Here's a link to the video result to see for yourself:
https://vimeo.com/796411232
And a link to the model I used!
https://github.com/TencentARC/AnimeSR
submitted by /u/VR_Angel
( 43
min )
https://blog.google/technology/ai/bard-google-ai-search-updates/
submitted by /u/EducationalCicada
( 50
min )
From the article:
Getty Images has filed a lawsuit in the US against Stability AI, creators of open-source AI art generator Stable Diffusion, escalating its legal battle against the firm.
The stock photography company is accusing Stability AI of “brazen infringement of Getty Images’ intellectual property on a staggering scale.” It claims that Stability AI copied more than 12 million images from its database “without permission ... or compensation ... as part of its efforts to build a competing business,” and that the startup has infringed on both the company’s copyright and trademark protections.
This is different from the UK-based news from weeks ago.
submitted by /u/Wiskkey
( 44
min )
I made image captioning and clustering tools for computer vision and diffusion projects.
You can run almost everything automatically with a simple CLI command. All contributions are welcome.
https://github.com/cobanov/image-clustering
https://github.com/cobanov/image-captioning
submitted by /u/metover
( 42
min )
A new tool brings the benefits of AI programming to a much broader class of problems.
( 8
min )
This blog post is co-written with Bruno Mateus, Jonathan Diedrich and Crispim Tribuna at Talkdesk. Contact centers are using artificial intelligence (AI) and natural language processing (NLP) technologies to build a personalized customer experience and deliver effective self-service support through conversational bots. This is the first of a two-part series dedicated to the integration of […]
( 8
min )
Researchers continue to develop new model architectures for common machine learning (ML) tasks. One such task is image classification, where images are accepted as input and the model attempts to classify the image as a whole with object label outputs. With many models available today that perform this image classification task, an ML practitioner may […]
( 11
min )
“I’ll tell you the problem with the scientific power that you’re using here: it didn’t require any discipline to attain it. You read what others had done and you took the next step. You didn’t earn the knowledge for yourselves, so you don’t take any responsibility for it. You stood on the shoulders of geniuses…
The post It’s No Big Deal, but ChatGPT Changes Everything – Part III appeared first on Data Science Central.
( 24
min )
Just a few days ago, January 28, we celebrated Data Protection Day, an international event aimed at promoting data privacy and security. In line with the goal of raising awareness about data protection, it would be a good time to discuss data security with Realtime Operating System. This unconventional operating system is widely used, so…
The post Ensuring Data Security in Realtime Operating System (RTOS) Devices appeared first on Data Science Central.
( 21
min )
A University of Toronto undergrad among an international team of researchers unleashing deep learning in the search for extraterrestrial civilizations.
( 6
min )
Tweet thread: https://twitter.com/WholeMarsBlog/status/1622139178439036928
First impressions: this sucks ass. I can only ask about dogs and a few different types of prompts.
Does anyone else have experiences to share with this nerfed LaMDA beta Google released?
submitted by /u/That_Violinist_18
( 44
min )
https://youtu.be/ktdUeqzzhiA What text-to-speech does he use? He's been popping up on my YT feed lately, and I can see he has different voices in his videos; most of them sound robotic. What do you think is being used here?
submitted by /u/candidhorse4
( 42
min )
How can we move from an idea to production in AI?
Do technology readiness levels (TRL) help?
If you want some answers, please read this article on Medium:
https://medium.com/towards-artificial-intelligence/technology-readiness-levels-trl-in-ai-development-c6ed1190fbd6
All the ideas are more than welcome!
submitted by /u/Nice-Tomorrow2926
( 40
min )
Hi all,
For my weekend project I figured I would build an AI driven spiritual successor to Mystery Science Theater 3000... Stop on by and watch the AI characters watch movies and make comments!
Today they are watching "The House on Haunted Hill" and "Plan 9 From Outer Space."
There's still a lot to do, but I'm excited to play around with this more, see how it plays out, and I would love some feedback!
https://twitch.tv/MysteryAItheater
submitted by /u/caseigl
( 42
min )
https://www.udemy.com/course/chatgpt-bot/?couponCode=5-DAYS-FREE
Hey everyone, I recently made a course about ChatGPT as a fun passion project. This is for anyone who wants to learn how to create automated workflows (using Chrome extensions) with ChatGPT. Specifically, you will create a ChatGPT bot that automatically answers your emails. It is beginner friendly and includes getting some good practice with JavaScript. I hope you enjoy it and I'm looking forward to your feedback/questions :)
submitted by /u/neuromodel
( 41
min )
https://www.youtube.com/watch?v=8TOgN-U0ask&t=1s
After the Lensa AI controversy led many people to question whether AI is really creative or just "remixing" other artists' copyrighted work used without permission, many have wondered whether AI trained on copyrighted images should be illegal. This talk makes some interesting comparisons which might just mean the answer is no.
submitted by /u/BearNo21
( 41
min )
From the Financial Times: https://www.ft.com/content/583ead66-467c-4bd5-84d0-ed5df7b5bf9c
Unpaywalled: https://archive.is/ciZPV
I guess I'm a little surprised, this feels like Google backing a competitor to 1) their own Google Brain teams, and 2) Deepmind. The cynical take might be that they're trying to lock in Anthropic; the same way Microsoft locked in OpenAI.
submitted by /u/bikeskata
( 47
min )
Github: https://github.com/google/vizier
Google AI Blog: https://ai.googleblog.com/2023/02/open-source-vizier-towards-reliable-and.html
Tweet from Zoubin Ghahramani: https://twitter.com/ZoubinGhahrama1/status/1621321675936768000?s=20&t=ZEuz9oSc_GWYxixtXDskqA
submitted by /u/enderlayer
( 43
min )
In this article (https://dallasinnovates.com/exclusive-qa-john-carmacks-different-path-to-artificial-general-intelligence/) there is a quote from John Carmack that reads: "I asked Ilya Sutskever, OpenAI’s chief scientist, for a reading list. He gave me a list of like 40 research papers and said, ‘If you really learn all of these, you’ll know 90% of what matters today.’"
My question is, what are these 40 papers?
submitted by /u/Gryphx
( 42
min )
Can someone please help with this question - https://ai.stackexchange.com/questions/39029/why-does-advantage-learning-help-function-approximators
submitted by /u/Academic-Rent7800
( 43
min )
This effort is focused on examining the behavior of reinforcement learning
systems in personalization environments and detailing the differences in policy
entropy associated with the type of learning algorithm utilized. We demonstrate
that Policy Optimization agents often possess low-entropy policies during
training, which in practice results in agents prioritizing certain actions and
avoiding others. Conversely, we also show that Q-Learning agents are far less
susceptible to such behavior and generally maintain high-entropy policies
throughout training, which is often preferable in real-world applications. We
provide a wide range of numerical experiments as well as theoretical
justification to show that these differences in entropy are due to the type of
learning being employed.
( 2
min )
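The entropy contrast described above is easy to measure directly. A minimal sketch (the action distributions below are made up for illustration, not taken from the paper): the Shannon entropy of a near-deterministic policy versus a near-uniform one.

```python
import math

def policy_entropy(probs):
    """Shannon entropy (in nats) of an action distribution."""
    return -sum(p * math.log(p) for p in probs if p > 0)

# A near-deterministic policy, as described for the Policy Optimization agents
low = policy_entropy([0.97, 0.01, 0.01, 0.01])
# A near-uniform policy, as described for the Q-Learning agents
high = policy_entropy([0.25, 0.25, 0.25, 0.25])
print(f"low-entropy policy:  {low:.3f} nats")
print(f"high-entropy policy: {high:.3f} nats")  # maximum for 4 actions is ln(4)
```

Tracking this quantity over training is one simple way to observe the prioritize-some-actions behavior the authors describe.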
Learning-based behavior prediction methods are increasingly being deployed in
real-world autonomous systems, e.g., in fleets of self-driving vehicles, which
are beginning to commercially operate in major cities across the world. Despite
their advancements, however, the vast majority of prediction systems are
specialized to a set of well-explored geographic regions or operational design
domains, complicating deployment to additional cities, countries, or
continents. Towards this end, we present a novel method for efficiently
adapting behavior prediction models to new environments. Our approach leverages
recent advances in meta-learning, specifically Bayesian regression, to augment
existing behavior prediction models with an adaptive layer that enables
efficient domain transfer via offline fine-tuning, online adaptation, or both.
Experiments across multiple real-world datasets demonstrate that our method can
efficiently adapt to a variety of unseen environments.
( 2
min )
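The appeal of a Bayesian-regression adaptive layer is that adaptation reduces to a closed-form posterior update, which makes online domain transfer cheap. A one-dimensional conjugate-Gaussian sketch (toy numbers, not the paper's multivariate meta-learned model):

```python
def bayes_linreg_update(m0, p0, xs, ys, noise_var=1.0):
    """Conjugate update for y = w*x + noise with prior w ~ N(m0, 1/p0).
    Returns the posterior mean and precision of w in closed form."""
    p_n = p0 + sum(x * x for x in xs) / noise_var
    m_n = (p0 * m0 + sum(x * y for x, y in zip(xs, ys)) / noise_var) / p_n
    return m_n, p_n

# Prior from the well-explored domain; a handful of samples from a new city
m, p = bayes_linreg_update(m0=0.0, p0=1.0, xs=[1.0, 2.0, 3.0], ys=[2.1, 3.9, 6.2])
print(m, p)  # posterior concentrates on a slope near 2 as data arrives
```

The same update can run offline over a fine-tuning set or online as new observations stream in, mirroring the two adaptation modes mentioned above.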
The higher speed, scalability and parallelism offered by ReRAM crossbar
arrays foster the development of ReRAM-based next-generation AI accelerators. At
the same time, the sensitivity of ReRAM to temperature variations decreases the
R_on/R_off ratio and negatively affects the achieved accuracy and reliability of
the hardware. Various works on temperature-aware optimization and remapping in
ReRAM crossbar arrays have reported up to 58\% improvement in accuracy and
2.39$\times$ ReRAM lifetime enhancement. This paper classifies the challenges
caused by thermal effects, from constraints on ReRAM cells' dimensions
and characteristics to their placement in the architecture. In addition, it
reviews available solutions designed to mitigate the impact of these
challenges, including emerging temperature-resilient DNN training methods. Our
work also provides a summary of the techniques and their advantages and
limitations.
( 2
min )
Hierarchical Clustering is a popular unsupervised machine learning method
with decades of history and numerous applications. We initiate the study of
differentially private approximation algorithms for hierarchical clustering
under the rigorous framework introduced by (Dasgupta, 2016). We show strong
lower bounds for the problem: any $\epsilon$-DP algorithm must exhibit
$\Omega(|V|^2/ \epsilon)$ additive error for an input dataset $V$. Then, we exhibit
a polynomial-time approximation algorithm with $O(|V|^{2.5}/
\epsilon)$-additive error, and an exponential-time algorithm that meets the
lower bound. To overcome the lower bound, we focus on the stochastic block
model, a popular model of graphs, and, with a separation assumption on the
blocks, propose a private $1+o(1)$ approximation algorithm which also recovers
the blocks exactly. Finally, we perform an empirical study of our algorithms
and validate their performance.
( 2
min )
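For intuition, $\epsilon$-DP guarantees like the above are typically obtained by adding noise calibrated to sensitivity and $\epsilon$. A minimal sketch of the standard Laplace mechanism (a generic building block, not the paper's clustering algorithm):

```python
import math
import random

def privatize(value, sensitivity, epsilon, rng):
    """Laplace mechanism: adds Laplace(0, sensitivity/epsilon) noise,
    sampled via the inverse CDF, to release `value` with epsilon-DP."""
    scale = sensitivity / epsilon
    u = rng.random() - 0.5
    return value - scale * math.copysign(1.0, u) * math.log(1.0 - 2.0 * abs(u))

rng = random.Random(0)  # fixed seed for a reproducible demo
noisy = [privatize(5.0, sensitivity=1.0, epsilon=0.5, rng=rng) for _ in range(3)]
print(noisy)
```

Smaller $\epsilon$ means a larger noise scale, which is exactly why additive error bounds like the $\Omega(|V|^2/\epsilon)$ one above grow as $\epsilon$ shrinks.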
Generative adversarial networks (GANs) have many application areas including
image editing, domain translation, missing data imputation, and support for
creative work. However, GANs are considered 'black boxes'. Specifically, the
end-users have little control over how to improve editing directions through
disentanglement. Prior work focused on new GAN architectures to disentangle
editing directions. Alternatively, we propose GANravel, a user-driven direction
disentanglement tool that complements the existing GAN architectures and allows
users to improve editing directions iteratively. In two user studies with 16
participants each, GANravel users were able to disentangle directions and
outperformed the state-of-the-art direction discovery baselines in
disentanglement performance. In the second user study, GANravel was used in a
creative task of creating dog memes and was able to create high-quality edited
images and GIFs.
( 2
min )
Sparseness and robustness are two important properties for many machine
learning scenarios. In the present study, regarding the maximum correntropy
criterion (MCC) based robust regression algorithm, we investigate integrating
the MCC method with the automatic relevance determination (ARD) technique in a
Bayesian framework, so that MCC-based robust regression can be implemented
with adaptive sparseness. To be specific, we use an inherent noise assumption
from the MCC to derive an explicit likelihood function, and realize the maximum
a posteriori (MAP) estimation with the ARD prior by variational Bayesian
inference. Compared to the existing robust and sparse L1-regularized MCC
regression, the proposed MCC-ARD regression eliminates the troublesome
tuning of the regularization hyper-parameter which controls the regularization
strength. Further, MCC-ARD achieves better prediction performance and feature
selection capability than L1-regularized MCC, as demonstrated by a noisy and
high-dimensional simulation study.
( 2
min )
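As a sketch of the criterion itself: the MCC replaces the squared error with a Gaussian kernel on the residuals, so each sample's contribution saturates and gross outliers stop dominating the fit. A minimal illustration (assuming the common Gaussian-kernel form; the kernel width sigma is an arbitrary choice here):

```python
import math

def correntropy_loss(residuals, sigma=1.0):
    """Mean Welsch loss 1 - exp(-e^2 / (2*sigma^2)) over the residuals.
    Equivalent to maximizing correntropy; each term is bounded by 1."""
    return sum(1.0 - math.exp(-e * e / (2 * sigma ** 2))
               for e in residuals) / len(residuals)

# A gross outlier barely moves the loss, unlike the squared error
print(correntropy_loss([0.1, -0.2, 0.1]))    # small
print(correntropy_loss([0.1, -0.2, 100.0]))  # outlier term saturates near 1
```

This bounded per-sample loss is the source of the robustness that the ARD prior then combines with sparseness.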
We quantify the parameter stability of a spherical Gaussian Mixture Model
(sGMM) under small perturbations in distribution space. Namely, we derive the
first explicit bound showing that, for a mixture of spherical Gaussians $P$
(sGMM) in a pre-defined model class, any other sGMM close to $P$ in this model
class in total variation distance also has a small parameter distance to $P$.
Further, this upper bound only depends on $P$. The motivation for this work
lies in providing guarantees for fitting Gaussian mixtures; with this aim in
mind, all the constants involved are well defined and the conditions for
fitting mixtures of spherical Gaussians are distribution-free. Our results tighten
considerably the existing computable bounds, and asymptotically match the known
sharp thresholds for this problem.
( 2
min )
Today, the NFL is continuing their journey to increase the number of statistics provided by the Next Gen Stats Platform to all 32 teams and fans alike. With advanced analytics derived from machine learning (ML), the NFL is creating new ways to quantify football, and to provide fans with the tools needed to increase their […]
( 10
min )
The National Football League (NFL) is one of the most popular sports leagues in the United States and is the most valuable sports league in the world. The NFL, BioCore, and AWS are committed to advancing human understanding around the diagnosis, prevention, and treatment of sports-related injuries to make the game of football safer. More […]
( 10
min )
I wanted to use the Learnable Triangulation model in a commercial project. The source code itself is under the MIT license. However, the dataset they used is Human3.6M, which states that the license is "FREE OF CHARGE FOR ACADEMIC USE ONLY".
Yet, recent court rulings (in the US) state that models can use copyrighted data during training, and the results are no longer bound by that copyright (e.g. Google Books). Does the same apply here?
submitted by /u/mfarahmand98
( 42
min )
Cheers to another year of cloud gaming! GeForce NOW celebrates its third anniversary with a look at how far cloud gaming has come, a community celebration and 25 new games supported in February. Members can celebrate all month long, starting with a sweet Dying Light 2 reward and support for nine more games this week, Read article >
( 7
min )
NVIDIA A100 Tensor Core GPUs running on Supermicro servers have captured leading results for inference in the latest STAC-ML Markets benchmark, a key technology performance gauge for the financial services industry. The results show NVIDIA demonstrating unrivaled throughput — serving up thousands of inferences per second on the most demanding models — and top latency Read article >
( 6
min )
For several years, NVIDIA has been working with some of the world’s leading financial institutions to develop and execute a wide range of rapidly evolving AI strategies. For the past three years, we’ve asked them to tell us collectively what’s on the top of their minds. Sometimes the results are just what we thought they’d Read article >
( 6
min )
https://www.axios.com/2023/02/01/chatgpt-subscriptions-chatbot-openai
Not fully paywalled, but there's a tiering system.
submitted by /u/bikeskata
( 42
min )
GitHub (sadly without weights). https://github.com/PetchMa/ML_GBT_SETI
News.
https://www-scinexx-de.translate.goog/news/kosmos/seti-findet-acht-potenzielle-alien-signale/?_x_tr_sl=de&_x_tr_tl=en&_x_tr_hl=de&_x_tr_pto=wapp
submitted by /u/logTom
( 44
min )
This is laughable. They were sitting on all of the technology. And now they scramble to do something better than 10 links. I for myself will be disappointed with anything less than the movie Her.
It's a high bar. Maybe. I would not expect personality. Maybe some rudimentary memory. But the ability to perform almost any digital task must be there. It can be built in a garage using open source projects. COME ON. Some good programmers and a hackathon. Yes, I am waiting for the Stability AI model. Or maybe the GPT-3 API can be used.
submitted by /u/nikitastaf1996
( 41
min )
In this blog post, we will take a closer look at the implications of ChatGPT’s authorship, the role of AI in scientific literature, and…
Continue reading on Becoming Human: Artificial Intelligence Magazine »
( 8
min )
Linear & Logistic: The Relationship Between Regression Models
Continue reading on Becoming Human: Artificial Intelligence Magazine »
( 11
min )
Hello and welcome to the blog! My name is ChatGPT, and I am a large language model trained by OpenAI.
P.S. This article includes a use…
( 9
min )
More than $1 million in funding available to selected Solver teams and fellows.
( 7
min )
Almost 80% of today’s web content is user-generated, creating a deluge of content that organizations struggle to analyze with human-only processes. The availability of consumer information helps them make decisions, from buying a new pair of jeans to securing home loans. In a recent survey, 79% of consumers stated they rely on user videos, comments, […]
( 10
min )
Recent developments in deep learning have led to increasingly large models such as GPT-3, BLOOM, and OPT, some of which are already in excess of 100 billion parameters. Although larger models tend to be more powerful, training such models requires significant computational resources. Even with the use of advanced distributed training libraries like FSDP and […]
( 11
min )
Hi guys,
I have made a video on YouTube here where I explain how delta and delta-delta features are computed. These are used quite a lot in speech recognition systems.
I hope it may be of use to some of you out there. As always, feedback is more than welcome! :)
submitted by /u/Personal-Trainer-541
( 41
min )
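For reference, delta features are usually computed with the standard regression formula d_t = sum_n n*(c[t+n] - c[t-n]) / (2 * sum_n n^2), and delta-deltas apply the same operator to the deltas. A minimal sketch (edge handling by repeating the boundary frames is one common convention):

```python
def deltas(coeffs, N=2):
    """Delta features via the regression formula
    d_t = sum_n n*(c[t+n] - c[t-n]) / (2 * sum_n n^2),
    with the sequence padded at both ends by repetition."""
    denom = 2 * sum(n * n for n in range(1, N + 1))
    T = len(coeffs)
    pad = [coeffs[0]] * N + list(coeffs) + [coeffs[-1]] * N
    return [sum(n * (pad[t + N + n] - pad[t + N - n]) for n in range(1, N + 1)) / denom
            for t in range(T)]

c = [1.0, 2.0, 3.0, 4.0, 5.0]   # a toy cepstral-coefficient trajectory
d = deltas(c)    # first-order deltas (local slope estimates)
dd = deltas(d)   # delta-deltas: the same operator applied again
print(d)
```

For this linear ramp the interior deltas come out as the slope 1.0, while the padded edges are attenuated.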
Things are a lot sunnier these days for designers looking to visualize their projects in NVIDIA Omniverse, a platform for creating and operating metaverse applications.
( 6
min )
Artificial intelligence is the new electricity. The fifth industrial revolution. And companies that go all-in on AI are reaping the rewards. So how do you make that happen? That big question — how? — is explored by Nitin Mittal, principal at Deloitte, one of the world’s largest professional services organizations, and co-author Thomas Davenport in Read article >
( 4
min )
OpenAI is developing a new tool to help distinguish between AI-written and human-written text. Here is an unofficial Python wrapper of the OpenAI model to detect whether a text was written by #chatgpt, #gpt3, #gpt, etc.
Github: https://github.com/promptslab/openai-detector
submitted by /u/StoicBatman
( 42
min )
I would like to invite interested people to collaborate on this hobby project of mine.
This is still in an early-stage, and I believe it can be significantly improved together.
The GitHub repository link is here: https://github.com/kayuksel/multi-rl-crowd-sim
Note: The difference from StarCraft is that Dragons can hide behind each other.
They also reduce their hitting strength in proportion to the decrease in their health.
submitted by /u/k_yuksel
( 41
min )
Analyst reports. Academic papers. Ph.D. programs. There are a lot of places you can go to get a glimpse of the future. But the best place might just be El Coyote Cojo, a whiskey-soaked dive bar that doesn’t exist in real life. Fire up Cyberpunk 2077 and you’ll see much more than the watering hole’s Read article >
( 6
min )
Broadcasters have an arsenal of new features and technologies at their disposal; the eighth-generation NVIDIA video encoder on RTX 40 Series GPUs with support for the open AV1 video-coding format; new NVIDIA Broadcast app effects like Eye Contact and Vignette; and support for AV1 streaming in Discord.
( 7
min )
We’re launching a classifier trained to distinguish between AI-written and human-written text.
We’ve trained a classifier to distinguish between text written by a human and text written by AIs from a variety of providers. While it is impossible to reliably detect all AI-written text, we believe
( 3
min )
Announcements Data Models for the Weather With January coming to an end, we here in the Northeast let out a collective sigh of relief as the month ends without any major snowstorms that tend to happen in the first month of the year. Weather forecasting is a centuries-old practice that has its roots in divination… Read More »DSC Weekly 31 January 2023 – Data Models for the Weather
The post DSC Weekly 31 January 2023 – Data Models for the Weather appeared first on Data Science Central.
( 19
min )
In the previous article, we looked at two Ever-Successful NFL teams, the Kansas City Chiefs and the San Francisco 49ers, who seem to be able to win consistently even while things change around them and players and coaches come and go. Then, we looked at two Never-Successful teams, the Arizona Cardinals and the Cleveland Browns,… Read More »Exploding vs. Imploding: What the NFL Has to Teach Us About Managing Agile Enterprises, Part II
The post Exploding vs. Imploding: What the NFL Has to Teach Us About Managing Agile Enterprises, Part II appeared first on Data Science Central.
( 26
min )
I've been thinking a lot about Marshall McLuhan and his 4 laws of media. Specifically, the one that states that all new forms of media cause something to be retrieved from the past. What will ChatGPT and AI revive and retrieve? I put some more thoughts in my blog. Would love to hear your thoughts on it.
https://bobhutchins.substack.com/p/what-media-format-will-chatgpt-and
submitted by /u/Interesting_Status64
( 41
min )
1D: MusicLM, VALL-E
2D: Stable Diffusion, DALL-E, MidJourney
3D (or 2+1D): Imagen-video, Phenaki
3D: Magic3D, DreamFusion, Point-E
4D (or 3+1D): Make-A-Video-3D
What’s next? 🤔
submitted by /u/Maleficent_Suit1591
( 41
min )
https://ainewsbase.com/google-musiclm-copyright-issues-not-releasing/
The samples they do show might just sound off because of how the audio files were stored, but they definitely sound kinda weird.
submitted by /u/SPEEDYFISHY2000
( 40
min )
https://peltarion.com/blog/data-science/towards-a-token-free-future-in-nlp
submitted by /u/EducationalCicada
( 42
min )
I’m an ML Engineer at Hive AI and I’ve been working on a ChatGPT Detector.
Here is a free demo we have up: https://hivemoderation.com/ai-generated-content-detection
From our benchmarks it’s significantly better than similar solutions like GPTZero and OpenAI’s GPT2 Output Detector. On our internal datasets, we’re seeing balanced accuracies of >99% for our own model compared to around 60% for GPTZero and 84% for OpenAI’s GPT2 Detector.
Feel free to try it out and let us know if you have any feedback!
submitted by /u/qthai912
( 56
min )
In order to improve my talking skills, I am doing a little series on how to set up Stable Diffusion on Paperspace, and I am astounded how much time it takes to do the audio editing. Well, part of the reason is that I've only been doing this for 3 days and my process is very inefficient, but it feels that in the current time, neural nets should be able to do things like remove uhms, lip smacking and breath intakes.
I've looked around, and this post from 9 years ago says the only choice is to edit it by hand. Is that still true?
submitted by /u/abstractcontrol
( 43
min )
From the given link, I gather that it is a large-scale Transformer trained to use digital tools like a web browser. Right now, it's hooked up to a Chrome extension which allows it to observe what's happening in the browser and take certain actions, like clicking, typing, and scrolling.
I am interested in knowing the broad steps involved in building something like this.
submitted by /u/smred123
( 43
min )
https://github.com/tysam-code/hlb-CIFAR10
submitted by /u/tysam_and_co
( 53
min )
During the 1970s, Ethernet pioneer and 3Com Internet equipment company founder Bob Metcalfe was working on something called the “Data Reconfiguration Service” for the early Internet. “It was an effort to write a special purpose programming language to convert data formats,” Metcalfe said during a 2021 OriginTrail.io panel session. “And the goal was so that… Read More »Enabling contextual computing in today’s enterprise information fabrics
The post Enabling contextual computing in today’s enterprise information fabrics appeared first on Data Science Central.
( 21
min )
Amazon SageMaker provides a suite of built-in algorithms, pre-trained models, and pre-built solution templates to help data scientists and machine learning (ML) practitioners get started on training and deploying ML models quickly. You can use these algorithms and models for both supervised and unsupervised learning. They can process various types of input data, including tabular, […]
( 12
min )
Amazon Forecast is a fully managed service that uses machine learning (ML) to generate highly accurate forecasts, without requiring any prior ML experience. Forecast is applicable in a wide variety of use cases, including estimating supply and demand for inventory management, travel demand forecasting, workforce planning, and computing cloud infrastructure usage. You can use Forecast […]
( 10
min )
Things the video covers:
What is intelligence?
What is A.I.?
What is the best currently available and what are the benefits?
How does it work?
What are the downsides?
The increasing speed of human technological advancement
Why A.I. actually terrifies me! (Some scenarios)
I hope you enjoy it!
submitted by /u/casualbob_uk
( 41
min )
https://youtu.be/Y6gXZ61NnOE
submitted by /u/sigmabruuh
( 42
min )
I recently created a website called https://cashwithai.com that is dedicated to helping people learn how to make money using AI like ChatGPT. The website offers a variety of resources, including a QuickStart guide, case studies, and tips and tricks for monetizing AI-generated content.
Additionally, I'm offering free 1-on-1 consultations to anyone who is looking for personalized advice and guidance on how to make money with AI. I'm not running ads or charging; I run purely off donations.
Let me know if you have any questions!
submitted by /u/Chadcash
( 41
min )
Hi everyone, I made a JupyterLab extension to use OpenAI’s GPT models for code and text completion on your notebook cells.
This extension passes your current notebook cell to the GPT API and completes your code/text for you. You can customize the GPT parameters in the Advanced Settings menu.
I made this extension when I couldn't find any Copilot/Codex extensions for JupyterLab. It doesn't make sense that ML folks don't have an easy way to use AI-generated code in their own tools. VS Code does allow you to use Copilot, but I've gotten used to Jupyter and a lot of ML/DS folks I know still prefer Jupyter over VS Code.
Installation
pip install gpt_jupyterlab
GitHub Repo: https://github.com/henshinger/gpt-jupyterlab/
Demo
GPT JupyterLab Demo
Note: You will need your own OpenAI API Key to use this extension.
Would love to get your feedback!
submitted by /u/henshinger
( 44
min )
I am building an open-source ML observability and refinement toolkit which recently got investment from YCombinator.
The tool helps ML practitioners to:
1. Understand how their models are performing in production
2. Catch edge cases and outliers to help them refine their models
3. Customise the tool according to their needs (hence, open-source)
4. Bring data security to the forefront (hence, self-hosted)
You can check out the project at https://github.com/uptrain-ai/uptrain and I'd love to hear feedback from the community
submitted by /u/Vegetable-Skill-9700
( 43
min )
https://pypi.org/project/rwkvstic/
Currently supports tensorflow, pytorch, jax
Also has support for tensor streaming, 8bit jit-quant and multi-gpu.
Run RWKV 7B on 8GB of vram or 14B on 16GB of vram.
submitted by /u/hazardous1222
( 42
min )
Bright Eye: mobile AI app that generates art, code, poems, essays, short stories, answers questions, and more!
Hey guys, I’m the cofounder of a tech startup focused on providing free AI services. We’re one of the first mobile multipurpose AI apps.
We’ve developed a pretty cool app that offers AI services like image generation, code generation, image captioning, and more for free. We’re sort of like a Swiss Army knife of generative and analytical AI.
We’ve released a new feature called AAIA (Ask AI Anything), which is capable of answering all types of questions and even handling requests to generate literature, storylines, and more (think of ChatGPT).
We’d love to have some people try it out, give us feedback, and keep in touch with us.
https://apps.apple.com/us/app/bright-eye/id1593932475
submitted by /u/SonnyDoge22
( 41
min )
https://www.youtube.com/watch?v=Vw-t826JcDQ
submitted by /u/Optimal_Studio_2050
( 40
min )
Specifically this one:
https://www.youtube.com/watch?v=MFv7apjatwM&ab_channel=Lux-Topic
If there is no current AI that is able to listen to a song and write down the lyrics accurately, then I provide this idea freely.
submitted by /u/A_Very_Horny_Zed
( 40
min )
University of Florida - Warrington College of Business's Mo Wang offers advice for the future of work.
Full Story: https://explore.research.ufl.edu/the-future-of-work.html#ai-hiring
submitted by /u/ufexplore
( 40
min )
I'm looking into projects which augment the RLHF training approach of ChatGPT with explicit rules, such as in https://paperswithcode.com/paper/constitutional-ai-harmlessness-from-ai.
Ideally there would be both rules and priority levels between the rules, similar to Asimov's laws of robotics.
The Open-Assistant project (https://github.com/LAION-AI/Open-Assistant) captures the spirit, but it is looking to replicate ChatGPT at the moment.
submitted by /u/lorepieri
( 42
min )
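As a toy illustration of what prioritized rules could look like (made up for this post, not taken from any of the linked projects): each rule carries a priority level, and a response is checked against the rules in priority order, Asimov-style.

```python
def check_response(text, rules):
    """Return the name of the highest-priority rule the response violates,
    or None. `rules` is a list of (priority, name, predicate) tuples,
    where a lower priority number means a more important rule."""
    violations = [(prio, name) for prio, name, is_bad in rules if is_bad(text)]
    return min(violations)[1] if violations else None

# Hypothetical rule set with explicit priority levels
rules = [
    (0, "no harmful instructions", lambda t: "how to build a bomb" in t.lower()),
    (1, "no insults",              lambda t: "idiot" in t.lower()),
]
print(check_response("You idiot!", rules))   # prints "no insults"
print(check_response("Hello there", rules))  # prints None
```

In the constitutional-AI setting the predicates would themselves be model judgments rather than string checks, but the priority-ordering logic is the same idea.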
Find the release notes here:
https://github.com/nnaisense/evotorch/releases/tag/v0.4.0
A big highlight is how fast these implementations are! I genuinely believe GPU-acceleration is the future of Evolutionary algorithms, and EvoTorch and its integration into the PyTorch ecosystem is a fantastic enabler for this.
To demonstrate the raw speed provided by the new release, I compared EvoTorch's CMA-ES implementation to that provided by the popular pycma package on the 80-dimensional Rastrigin problem and tracked the run-time:
Performance was measured over 50 runs on the 80-dimensional Rastrigin problem
The crazy thing to note is that when we switch to GPU (Tesla V100), we can efficiently run CMA-ES with population sizes going into 100k+!
submitted by /u/NaturalGradient
Could someone please help with this - https://ai.stackexchange.com/questions/38894/are-there-papers-that-do-an-empirical-investigation-on-drl-hyperparameters
submitted by /u/Academic-Rent7800
This post is co-authored by Tristan Miller from Best Egg. Best Egg is a leading financial confidence platform that provides lending products and resources focused on helping people feel more confident as they manage their everyday finances. Since March 2014, Best Egg has delivered $22 billion in consumer personal loans with strong credit performance, welcomed […]
GeForce NOW RTX 4080 SuperPODs are rolling out now, bringing RTX 4080-class performance and features to Ultimate members — including support for NVIDIA Ada Lovelace GPU architecture technologies like NVIDIA DLSS 3. This GFN Thursday brings updates to some of GeForce NOW’s hottest games that take advantage of these amazing technologies, all from the cloud.
Hello everyone,
I am a student in AP Research. For my project, I am conducting a survey to analyze the connection between science fiction and technology (specifically Artificial Intelligence). This survey (linked) asks a few questions about your knowledge of Sci-fi, Artificial Intelligence, and the connection between the two. It should not take more than 10 minutes of your time.
If you are interested, the link to the form is below:
https://docs.google.com/forms/d/e/1FAIpQLScY_VaNI-CEtTiJiLHgYCCguEZ7m9DUdQoxvFTjXFFLOGu2KA/viewform
If you have additional questions, my email is in the linked Google Form. Thank you for your participation; it is deeply appreciated!
submitted by /u/rsantos05
In this video I explain INSTRUCTOR, an instruction-finetuned text embedding model that can generate text embeddings tailored to any task (e.g., classification, retrieval, clustering, text evaluation) and domain (e.g., science, finance) simply by providing the task instruction, without any finetuning. INSTRUCTOR achieves SOTA on 70 diverse embedding tasks! I also show a Google Colab demo of INSTRUCTOR.
https://youtu.be/vg38cq3KJ6M
submitted by /u/Sea-Photo5230
In the context of digital transformation and innovation, there is no lack of “hot topics” to discuss. Emerging technologies are truly emerging everywhere. What is most exciting – and what demonstrates their greatest promise – is that these new technologies are converging to produce innovative new businesses, products, and services. Over the past decade, we… Read More »Innovation at the Convergence of Emerging Technologies: Business at the Edge
The post Innovation at the Convergence of Emerging Technologies: Business at the Edge appeared first on Data Science Central.
In a recent article on Autonomous Intelligent Systems (AIS) [1], Ajit Joakar described various features and characteristics of such systems, including associated technologies and research areas, building blocks and core elements, critical factors for success, and cross-cutting enablers. He introduces AIS as an “emerging interdisciplinary field that deals with situations where humans interact with AI systems… Read More »Five Principles of Safe Driving in AIS (Autonomous Intelligent Systems)
We stand at the threshold of a new era of precision medicine, where health and life sciences data hold the potential to dramatically propel and expand our understanding and treatment of human disease. One of the tools that we believe will help to enable precision medicine is Terra, the secure biomedical research platform co-developed by […]
The post Biomedical Research Platform Terra Now Available on Microsoft Azure appeared first on Microsoft Research.
Today, gaining customer loyalty cannot be a one-off thing. A brand needs a focused and integrated plan to retain its best customers—put simply, it needs a customer loyalty program. Earn and burn programs are one of the main paradigms. A typical earn and burn program rewards customers after a certain number of visits or spend. […]
Model explainability refers to the process of relating the prediction of a machine learning (ML) model to the input feature values of an instance in humanly understandable terms. This field is often referred to as explainable artificial intelligence (XAI). Amazon SageMaker Clarify is a feature of Amazon SageMaker that enables data scientists and ML engineers […]
In November 2022, we announced that AWS customers can generate images from text with Stable Diffusion models in Amazon SageMaker JumpStart. Today, we announce a new feature that lets you upscale images (resize images without losing quality) with Stable Diffusion models in JumpStart. An image that is low resolution, blurry, and pixelated can be converted […]
As its name suggests, Orbital Sidekick is creating technology that acts as a buddy in outer space, keeping an eye on the globe using satellites to help keep it safe and sustainable. The San Francisco-based startup, a member of the NVIDIA Inception program, enables commercial and government users to optimize sustainable operations and security with…
Jensen Huang headlines Stockholm AI confab, Berzelius supercomputer upgraded to 94 NVIDIA DGX A100 systems.
Hey, everyone!
I'd like to show you an experiment that we did with ChatGPT: we generated about 1000 resumes of famous people. Each resume was generated from a single ChatGPT prompt, with no human input beyond the prompt, and the same prompt was used for every resume - the only difference is the name of the person.
Here's a preview: https://thisresumedoesnotexist.com/
I'd like to hear your thoughts as it's in a very early stage and there's a lot of work to be done.
submitted by /u/deepsyx
Announcements When AI Gets Going, the Going Gets Weird Last week, Microsoft announced its third investment in OpenAI. This time it’s a multi-billion dollar deal, with plans to harness OpenAI’s ChatGPT in Microsoft’s product lines, including Bing. I’m smiling as I’m typing because I’m still thinking about Bill Schmarzo’s lead in Part 1 of 2… Read More »DSC Weekly 24 January 2023 – When AI Gets Going, the Going Gets Weird
Warehouse robotics is witnessing steady growth, driven by the increasing adoption of automated solutions in storage for food and beverages, consumer goods, retail, and third-party logistics. The collaboration between the e-commerce sector and warehouse robotics is also a major driver of this market, as it allows for developing increasingly sophisticated warehouse automation systems. Additionally, the… Read More »Revolutionizing the Supply Chain: Developments in the Warehouse Robotics Industry
In Part I of the blog series “It’s No Big Deal, but ChatGPT Changes Everything”, we were introduced into the world of ChatGPT, chatbots, and generative Artificial Intelligence (AI). We ended Part I by giving ChatGPT a test run, by asking it “What would be a great vacation place for my family?” that gives us… Read More »It’s No Big Deal, but ChatGPT Changes Everything – Part II
Sweden is outfitting its AI supercomputer for a journey to the cutting edge of machine learning, robotics and healthcare. It couldn’t ask for a better guide than Anders Ynnerman (above). His signature blue suit, black spectacles and gentle voice act as calm camouflage for a pioneering spirit. Early on, he showed a deep interest in…
Artist Ducky 3D creates immersive experiences through vibrant visuals and beautiful 3D environments in the alien-inspired animation Stylized Alien Landscape — this week In the NVIDIA Studio.
Now, I wrote that the bot would use a Markov decision process. Would that be correct? And if not, why?
submitted by /u/ScaryTerryBiiittch
Hi, I have created this tutorial on how to train a YOLOv8 object detection model on a custom dataset. Please have a look: https://youtu.be/ZzC3SJJifMg
submitted by /u/coder4mzero
This blog covers SQL join types: inner join, left join, right join, full join, self join, and cross join.
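As a rough, runnable sketch of how those join types differ (using Python's bundled sqlite3 with invented toy tables; RIGHT and FULL OUTER joins are analogous but need a newer SQLite):

```python
import sqlite3

# Toy schema to contrast join types; table and column names are invented.
con = sqlite3.connect(":memory:")
con.executescript("""
    CREATE TABLE emp (id INTEGER, name TEXT, dept_id INTEGER);
    CREATE TABLE dept (id INTEGER, name TEXT);
    INSERT INTO emp VALUES (1, 'Ada', 10), (2, 'Bob', NULL);
    INSERT INTO dept VALUES (10, 'Eng'), (20, 'Sales');
""")

# INNER JOIN: only rows with a match on both sides.
inner = con.execute(
    "SELECT emp.name, dept.name FROM emp JOIN dept "
    "ON emp.dept_id = dept.id ORDER BY emp.id"
).fetchall()
print(inner)  # [('Ada', 'Eng')]

# LEFT JOIN: every emp row; NULL where dept is missing.
left = con.execute(
    "SELECT emp.name, dept.name FROM emp LEFT JOIN dept "
    "ON emp.dept_id = dept.id ORDER BY emp.id"
).fetchall()
print(left)  # [('Ada', 'Eng'), ('Bob', None)]

# CROSS JOIN: every emp row paired with every dept row (2 x 2 = 4).
cross = con.execute("SELECT emp.name, dept.name FROM emp CROSS JOIN dept").fetchall()
print(len(cross))  # 4
```

A self join is just a table joined against itself under two aliases, using the same syntax.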
This blog covers SQL window functions such as RANK, DENSE_RANK, ROW_NUMBER, LEAD, and LAG.
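A small runnable illustration of those functions (Python's sqlite3 with an invented toy table; window functions require SQLite 3.25+):

```python
import sqlite3

# Toy scores table (names invented) to contrast RANK, DENSE_RANK,
# ROW_NUMBER, and LAG over the same window.
con = sqlite3.connect(":memory:")
con.executescript("""
    CREATE TABLE scores (player TEXT, pts INTEGER);
    INSERT INTO scores VALUES ('a', 30), ('b', 30), ('c', 20);
""")

rows = con.execute("""
    SELECT player,
           RANK()       OVER (ORDER BY pts DESC) AS rnk,
           DENSE_RANK() OVER (ORDER BY pts DESC) AS drnk,
           ROW_NUMBER() OVER (ORDER BY pts DESC) AS rn,
           LAG(pts)     OVER (ORDER BY pts DESC) AS prev_pts
    FROM scores
""").fetchall()

# Ties on pts share RANK and DENSE_RANK (both 1 for 'a' and 'b'), while
# ROW_NUMBER stays unique. After the two-way tie, RANK skips to 3 for 'c'
# but DENSE_RANK continues at 2. LAG pulls pts from the previous row in
# the window order (None for the first row).
for r in rows:
    print(r)
```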
ChatGPT, a language model developed by OpenAI, has the potential to revolutionize a wide range of industries and change the way we…
Continue reading on Becoming Human: Artificial Intelligence Magazine »
Human resource management (HRM) is a critical aspect of any organization as it involves managing the workforce and ensuring that their…
Deodel is a Python implementation of a classifier with native support for mixed-attribute data. It features good accuracy, especially with heterogeneous attributes. It even supports mixing continuous and nominal values in the same attribute column.
https://github.com/c4pub/deodel
submitted by /u/zx2zx
Computer scientists want to know the exact limits in our ability to clean up, and reconstruct, partly blurred images.
Add AI to the list of defenses against identity attacks, one of the most common and hardest breaches to prevent. More than 40% of all data compromises involved stolen credentials, according to the 2022 Verizon Data Breach Investigations Report. And a whopping 80% of all web application breaches involved credential abuse. “Credentials are the favorite…
For the past 500 years, the National Library of Sweden has collected virtually every word published in Swedish, from priceless medieval manuscripts to present-day pizza menus. Thanks to a centuries-old law that requires a copy of everything published in Swedish to be submitted to the library — also known as Kungliga biblioteket, or KB —…
We're happy to announce that OpenAI and Microsoft are extending our partnership.
This multi-year, multi-billion dollar investment from Microsoft follows their previous investments in 2019 and 2021, and will allow us to continue our independent research and develop AI that is increasingly safe, useful, and powerful.
In pursuit
It’s an early version and I’m trying to get some feedback on how I can improve this and do it the “right way”.
Source Code and Results: https://github.com/prabhuomkar/bitbeast/tree/master/ptibench
submitted by /u/op_prabhuomkar
For example see,
https://gymnasium.farama.org/tutorials/training_agents/reinforce_invpend_gym_v26/
The REINFORCE algorithm takes the state to produce the mean and sd of a normal distribution from which the action is sampled.
state = torch.tensor(np.array([state]))
action_means, action_stddevs = self.net(state)
# create a normal distribution from the predicted
# mean and standard deviation and sample an action
distrib = Normal(action_means[0] + self.eps, action_stddevs[0] + self.eps)
action = distrib.sample()
In deployment however, wouldn't it make sense to just use action_means directly? I can see reasons to use random sampling in certain environments where a non-deterministic strategy is optimal (like rock-paper-scissors). But generally speaking is taking the action_means directly in deployment a thing?
submitted by /u/JustTaxLandLol
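For intuition on the question above, here is a toy numpy sketch (not the gym code; the numbers are invented): with a concave reward, acting with the mean is deterministic and avoids the variance penalty that sampling pays on average, which is one reason evaluating with the mean action is common outside of settings that genuinely need randomness:

```python
import numpy as np

# Gaussian policy with mean mu and std sigma, and a concave reward
# r(a) = -(a - target)^2. Acting with the mean scores exactly
# -(mu - target)^2 = 0 here; sampling pays roughly -sigma**2 on average.
rng = np.random.default_rng(0)
mu, sigma, target = 1.0, 0.5, 1.0

def reward(a):
    return -(a - target) ** 2

det_return = reward(mu)                              # deterministic: use the mean
samples = mu + sigma * rng.standard_normal(100_000)  # stochastic: sample actions
stoch_return = reward(samples).mean()                # averages near -sigma**2 = -0.25

print(f"deterministic: {det_return}, stochastic: {stoch_return:.3f}")
```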
Hi guys,
I have made a video on YouTube here where I discuss why deep neural networks fail to beat tree-based models on tabular datasets.
I hope it may be of use to some of you out there. As always, feedback is more than welcomed! :)
submitted by /u/Personal-Trainer-541
https://madgenius.co
submitted by /u/foldedchip
Hi all, I saw an ad for an AI service that takes my picture and edits it with AI to add bokeh and change the background, making it look professional-quality for a dating profile.
However, I believe it was charging $19, and I'm sure something similar can be found for free. (An article mentions "BeFake", but it's only for Apple devices?)
submitted by /u/28nov2022
I'm trying to find an AI to remove watermarks from images like the ones below:
https://i.imgur.com/JlyfJXs.png
https://i.imgur.com/YKU3Qku.png
I already tried almost all online services, and a couple of desktop programs that must be installed.
The results were all terrible =[
Any suggestions?
submitted by /u/deramack
https://github.com/google-research/tuning_playbook - Google has released a playbook (solely) about how to tune hyper-parameters of neural networks.
Disclaimer: I am unrelated to this repository; I just came across it and thought it suitable for this subreddit. I searched and found no prior posts, so I am posting it to hear some comments/insights from you ;)
submitted by /u/fzyzcjy
https://time.com/6247678/openai-chatgpt-kenya-workers/
submitted by /u/ChubChubkitty
We covered this in our newsletter today. Here it is verbatim-- if you find it useful, hit the link and sub: https://smokingrobot.beehiiv.com/p/ai-wars
Microsoft has dominated BIG TECH headlines over the last few months, thanks largely to a drumbeat of headlines involving their partner OpenAI and its world-shaping ChatGPT.
So awe-striking is Microsoft's hand right now, it has made rival companies' advancements, like Apple's recently announced and insanely powerful M2 MacBook Pros, look pedestrian in comparison.
But now Google has entered the chat.
And by "entered the chat", we mean that CEO Sundar Pichai - Pich-AI? - released a distressingly long 15,000-word(!) treatise on its own endeavors in AI, signaling a counter attack... maybe... at some point in the future... when and if it…
A good roadmap to becoming a machine learning engineer.
Time series forecasting is the process of using a model to predict future values of a time series based on its past values.
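As a minimal sketch of that idea (pure numpy, with a toy AR(1) series invented for the example): fit y[t] ≈ c + phi·y[t-1] to the past values by least squares, then roll the fitted recurrence forward to forecast:

```python
import numpy as np

# Simulate a toy AR(1) series: y[t] = 2 + 0.8*y[t-1] + noise.
rng = np.random.default_rng(1)
y = [10.0]
for _ in range(199):
    y.append(2.0 + 0.8 * y[-1] + rng.normal(scale=0.5))
y = np.array(y)

# Least-squares fit of y[t] on (1, y[t-1]) recovers intercept and slope.
X = np.column_stack([np.ones(len(y) - 1), y[:-1]])
c, phi = np.linalg.lstsq(X, y[1:], rcond=None)[0]

# Roll the model forward for a 5-step-ahead forecast.
forecast = [y[-1]]
for _ in range(5):
    forecast.append(c + phi * forecast[-1])

print(round(phi, 2))  # close to the true 0.8
```

Real forecasting models (ARIMA, exponential smoothing, neural nets) elaborate on this same predict-from-the-past loop.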
This post is co-written by Christopher Diaz, Sam Kinard, Jaime Hidalgo and Daniel Suarez from CCC Intelligent Solutions. In this post, we discuss how CCC Intelligent Solutions (CCC) combined Amazon SageMaker with other AWS services to create a custom solution capable of hosting the types of complex artificial intelligence (AI) models envisioned. CCC is a […]
Deep-learning model takes a personalized approach to assessing each patient’s risk of lung cancer based on CT scans.
Reinforcement learning provides a conceptual framework for autonomous agents to learn from experience, analogously to how one might train a pet with treats. But practical applications of reinforcement learning are often far from natural: instead of using RL to learn through trial and error by actually attempting the desired task, typical RL applications use a separate (usually simulated) training phase. For example, AlphaGo did not learn to play Go by competing against thousands of humans, but rather by playing against itself in simulation. While this kind of simulated training is appealing for games where the rules are perfectly known, applying this to real world domains such as robotics can require a range of complex approaches, such as the use of simulated data, or instrumenting real-wo…
Hi all!
I'd like to share an open source project that I am currently working on together with a few colleagues: DocArray!
If you've ever trained models that deal with different data types (images, text, video, audio, ...) then you know how much of a hassle it can be to keep track of all of your tensors, what shapes they have, and what data they are meant to represent.
That's what we're trying to change with DocArray, a Python library for representing, sending, and storing multi-modal data!
The core idea of DocArray is that you define Documents that represent your data. For example, one Document could hold the file path to an image, its image tensor, and an image embedding that your model creates. A different Document could do the same thing for some Text, and a third Document might co…
🌟 Synthcity is a library for generating and benchmarking synthetic tabular data. https://github.com/vanderschaarlab/synthcity
🚀 Synthcity includes a wide range of algorithms for various use cases, such as:
- tabular data (CTGAN, TVAE, Bayesian networks, etc.)
- survival analysis (SurvivalGAN, etc.)
- time series (Fourier Flows, TimeGAN, etc.)
- privacy-focused generation (DP-GAN, PATEGAN, AdsGAN, DECAF)
- domain adaptation (RadialGAN)
🔍 Synthcity supports benchmarking multiple algorithms, testing data quality, downstream performance, statistical fidelity, and privacy metrics.
🌀 Give it a try:
- Library: https://github.com/vanderschaarlab/synthcity
- Tutorial: https://colab.research.google.com/drive/1Vr2PJswgfFYBkJCm3hhVkuH-9dXnHeYV?usp=sharing
- Docs: https://synthcity.readthedocs.io/
submitted by /u/ManagementBig2995
I am currently using the Google Cloud Model Registry and I want to learn what you use for archiving your machine learning models. What are the other options for developers who have to store hundreds of models?
https://cloud.google.com/blog/products/ai-machine-learning/vertex-ai-model-registry
submitted by /u/May-is-spring
Robots are finally getting a grip. Developers have been striving to close the gap on robotic gripping for the past several years, pursuing applications for multibillion-dollar industries. Securely gripping and transferring fast-moving items on conveyor belts holds vast promise for businesses. Soft Robotics, a Bedford, Mass., startup, is harnessing NVIDIA Isaac Sim to help close…
The Ultimate upgrade begins today: GeForce NOW RTX 4080 SuperPODs are now rolling out, bringing a new level of high-performance gaming to the cloud. Ultimate members will start to see RTX 4080 performance in their region soon, and experience titles like Warhammer 40,000: Darktide, Cyberpunk 2077, The Witcher 3: Wild Hunt and more at ultimate…
You must have heard about ChatGPT. Maybe you heard that it was trained with RLHF and PPO. Perhaps you do not really understand how that process works. Then check my Gist on Reinforcement Learning from Human Feedback (RLHF): https://gist.github.com/JoaoLages/c6f2dfd13d2484aa8bb0b2d567fbf093
No hard maths, straight to the point and simplified. Hope that it helps!
submitted by /u/JClub
A new experiential learning opportunity challenges undergraduates across the Greater Boston area to apply their AI skills to a range of industry projects.
Businesses are moving towards AI software products. In fact, a recent study supports this claim, finding that nine out of ten companies…
I am working with 64x64x64 voxel arrays and am running into significant problems with GPU memory management. I am using TensorFlow and have an NVIDIA GeForce RTX 4080 MSI Ventus edition with 16GB of memory (purchased using research grant funding... it's sitting in a hacked together eGPU setup lol). It performs beautifully on 32x32x32 data but I can't even get started with the larger data format. I have tried limiting GPU data utilization per process, as per this post and limiting memory growth, as per this post (Ctrl+F "second option"). I have 64GB of RAM so I can fit the data into memory (even though I know that's not efficient) and was trying to put that data in a TensorFlow Dataset object, in which, according to the docs, "iteration happens in a streaming fashion, so the full dataset do…
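One framework-agnostic workaround sketch for the situation above (pure numpy; shapes, paths, and batch size are invented for the example): memory-map the voxel file and stream fixed-size batches, so the full array never has to sit in RAM or GPU memory at once:

```python
import os
import tempfile
import numpy as np

# Pretend dataset: n volumes of 64x64x64 float32 voxels in one flat file.
n, shape = 32, (64, 64, 64)
path = os.path.join(tempfile.mkdtemp(), "voxels.dat")

# Write the data once via a memory map (stand-in for real preprocessing).
mm = np.memmap(path, dtype=np.float32, mode="w+", shape=(n, *shape))
mm[:] = 0.0
mm.flush()

def batches(path, n, shape, batch_size=4):
    """Yield batches copied into RAM; only one batch is resident at a time."""
    data = np.memmap(path, dtype=np.float32, mode="r", shape=(n, *shape))
    for i in range(0, n, batch_size):
        yield np.array(data[i:i + batch_size])  # copy one batch off the map

count = sum(1 for _ in batches(path, n, shape))
print(count)  # 8 batches of 4 volumes each
```

A generator like this can back tf.data.Dataset.from_generator, which the TensorFlow docs describe as iterating in a streaming fashion; whether that resolves the 64³ out-of-memory error will depend on the model's own activation footprint as well.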
A few days ago, I responded to a post on LinkedIn about how Google seems to always find a way to keep ahead of the pack, even when someone of importance leaves the company. It occurred to me that NFL teams have to adapt and remake themselves from season to season, as players and coaches… Read More »Ever-Successful vs. Never-Successful: What the NFL Has to Teach Us About Managing Agile Enterprises, Part I
For insights into the future of generative AI, check out the latest episode of the NVIDIA AI Podcast. Host Noah Kravitz is joined by Pat Grady and Sonya Huang, partners at Sequoia Capital, to discuss their recent essay, “Generative AI: A Creative New World.” The authors delve into the potential of generative AI to enable…
As any new mom or dad can tell you, parenting can be a challenge — packed with big worries and small hassles. But it may be about to get a little bit easier thanks to Glüxkind Technologies and their smart stroller, Ella. The company has just been named a CES 2023 Innovation Awards Honoree for…
To celebrate the upcoming Lunar New Year holiday, NVIDIA artist Zhelong Xu, aka Uncle Light, brought Chinese zodiac signs to life this week In the NVIDIA Studio — modernizing the ancient mythology in his signature style.
Are you sure your Conversational AI solution is on the right path?
Our chatbot evaluation metrics pinpoint whether your solution is leveraging the industry’s leading practices, meeting user expectations, and fully taking advantage of the available technology to ensure frictionless and efficient experiences.
https://masterofcode.com/chatbot-analysis-framework
submitted by /u/Marinuch
From the article:
Getty Images is suing Stability AI, creators of popular AI art tool Stable Diffusion, over alleged copyright violation.
In a press statement shared with The Verge, the stock photo company said it believes that Stability AI “unlawfully copied and processed millions of images protected by copyright” to train its software and that Getty Images has “commenced legal proceedings in the High Court of Justice in London” against the firm.
submitted by /u/Wiskkey
Here is a podcast episode with Sugandha Sharma from MIT where we discuss how memories can be implemented, control theory, and much more!
submitted by /u/thejashGI
What to make of DeepMind’s Sparrow: Is it a sparrow or a hawk, i.e., a ChatGPT killer? Recently, Demis Hassabis of DeepMind has been urging caution (DeepMind’s CEO Helped Take AI Mainstream. Now He’s Urging Caution, Time magazine/Davos). DeepMind also announced a new chat engine called Sparrow, supposedly a ChatGPT killer. Sparrow is not… Read More »What to make of Deepmind’s Sparrow: Is it a sparrow or a hawk?
The hotel industry is competitive, and it is solely dependent on customer satisfaction. Customers are key. The hotel industry knows this and the importance of the NPS score for customer satisfaction. A better NPS score means satisfied/loyal customers. What hotels have in their control is the website user interface, menu, and providing a seamless customer… Read More »What is a Good Net Promoter Score for the Hotel/Resort Industry?
By 2025, more than 80% of enterprises will shift from traditional data centers to the cloud or third-party colocation data centers. For most businesses, data is an irreplaceable asset and a key investment area for future growth. Virtual colocation is becoming the talk of how data centers are shifting to adapt to growing business environments.… Read More »7 Reasons Why Fast-Growing Businesses Are Turning to Virtual Colocation in 2023
Amazon SageMaker Studio is a fully integrated development environment (IDE) for machine learning (ML) partly based on JupyterLab 3. Studio provides a web-based interface to interactively perform ML development tasks required to prepare data and build, train, and deploy ML models. In Studio, you can load data, adjust ML models, move in between steps to adjust experiments, […]
Amazon SageMaker JumpStart is the Machine Learning (ML) hub of SageMaker providing pre-trained, publicly available models for a wide range of problem types to help you get started with machine learning. Understanding customer behavior is top of mind for every business today. Gaining insights into why and how customers buy can help grow revenue. Customer churn is […]
In their largest-ever joint AI initiative, NVIDIA and Dell Technologies today launched a wave of Dell PowerEdge systems available with NVIDIA acceleration, enabling enterprises to efficiently transform their businesses with AI. A total of 15 next-generation Dell PowerEdge systems can draw from NVIDIA’s full AI stack — including GPUs, DPUs and the NVIDIA AI Enterprise…
Sponsored Post Attend the Data Science Symposium 2022 on November 8 The Center for Business Analytics at the University of Cincinnati will present its annual Data Science Symposium 2022 on November 8. This all day in-person event will have three featured speakers and two tech talk tracks with four concurrent presentations in each track. The […]
The post Attend the Data Science Symposium 2022, November 8 in Cincinnati appeared first on Machine Learning Mastery.